Vol. 48, Parts 1 and 2 





BIOMETRIKA


FOUNDED BY 


W. F. R. WELDON, FRANCIS GALTON AND KARL PEARSON


MANAGING EDITOR 


E. S. PEARSON


ASSOCIATE EDITOR 
J. DURBIN 





ISSUED BY 
THE BIOMETRIKA OFFICE, UNIVERSITY COLLEGE LONDON 














[Issued June 1961]








This volume of Biometrika is published with the co-operation of 


F. N. DAVID N. L. JOHNSON 
M. J. R. HEALY D. V. LINDLEY 
R. L. PLACKETT





A volume containing about 500 pages will be published annually in two half-yearly 
issues, appearing in June and December. 


Papers for publication should be sent to 
PROFESSOR E. S. PEARSON 
University College London, Gower Street, London, W.C. 1 


It is a condition of publication in Biometrika that the paper shall not already have 
been issued elsewhere, and will not be reprinted without leave of the Editors. 


Contributors receive 50 copies of their papers free. Joint authors 25 copies each. 
Order forms for separates are sent to authors with proofs of their papers. 







SUBSCRIPTIONS for 1961 (Volume 48) 
The subscription price, payable in advance is: 
£2. 14s. (or $8.00) net per volume including packing and postage. 


Subscribers who buy the journal Biometrika for their own and not for general use 
may apply to the Secretary to be registered as Personal Subscribers, for whom the 
subscription price is £2. 5s. (payable only in sterling). 


Cheques should be made payable to Biometrika, crossed ‘a/c BIOMETRIKA TRUST’ 
and sent to THE SECRETARY, BIOMETRIKA OFFICE, UNIVERSITY
COLLEGE LONDON, GOWER STREET, LONDON, W.C. 1, to whom all 
orders for Offprints, Index and Table Separates (see pages (iv) to (vi)) should also be 
sent. For particulars as to the availability of Back Issues please see inside back cover. 


[Members of the Institute of Mathematical Statistics are reminded that they may 
subscribe to Biometrika through the Institute who will accept subscriptions to Biometrika 
at the special rate of $6.50 if paid to the Treasurer of the Institute before 1 March.] 



















Biometrika (1961), 48, 1 and 2, p. 1 


Printed in Great Britain 


Studies in the history of probability and statistics 


XI. Daniel Bernoulli on maximum likelihood 


By M. G. KENDALL 


Research Techniques Division, London School of Economics and Political Science 


1. Almost as soon as the calculus of probabilities began to take a definite shape mathe- 
maticians were concerned with the use of probabilistic ideas in reconciling discrepant 
observations. James Bernoulli’s Ars Coniectandi was published in 1713. Within 9 years 
we find Roger Cotes (1722), in a work on the estimation of errors in trigonometrical men- 
suration, discussing what would nowadays be described as an estimation problem in a 
plane. Let p, q, r, s be four different determinations of a point o, with weights P, Q, R, S
which are inversely proportional to distance from o (pondera reciproce proportionalia spatiis
evagationum). Put weights P at p, etc., and find their centre of gravity z. This, says Cotes,
is the most probable site of o. (Dico punctum z fore locum obiecti maxime probabilem, qui pro
vero eius loco tutissime haberi potest.) Cotes does not say why he thinks this is the most 
probable position or how he arrived at the rule. 

2. According to Laplace this result of Cotes was not applied until Euler (1749) used it 
in some work on the irregularities in the motion of Saturn and Jupiter. Further attacks on 
the problem of a somewhat similar kind were employed by Mayer (1750) in a study of lunar 
libration and by Boscovich (1755) in measurements on the mean ellipticity of the earth. 
There was evidently a good deal of interest being taken in the combination of observations 
about the middle of the eighteenth century. The ideas, as was only natural, were often 
intuitive and sometimes obscurely expressed, but the fundamental questions seem to have 
been asked at quite an early stage. For example, Simpson (1757) refers to a current opinion 
that one good observation was as accurate as the arithmetic mean of a set, and although 
from that point onwards a series of writers argued for the arithmetic mean, Laplace (1774), 
in his first great memoir, was clearly aware that for some distributions of error there were 
better estimators such as the median. 

3. Simpson (1756, 1757) was the first to introduce the concept of distribution of error 
and to consider continuous distributions. But like most of his contemporaries he regarded 
it as inevitable to impose two conditions: first, the distributions must be symmetrical; 
secondly, they must be finite in range. Lagrange reproduced Simpson’s work without 
acknowledgement in a memoir published between 1770 and 1773, but Lagrange’s con- 
tributions are more of analytical than of probabilistic interest. 

4. Daniel Bernoulli was born in 1700 and lived to be 82. Throughout his productive life 
he made contributions to the theory of probability and although his mathematical methods 
are not now of much importance, the originality of his thinking on such matters as moral 
expectation entitles him to a permanent place among the founders of the subject. In 
particular, the memoir on maximum likelihood reproduced in the following pages is 
astonishingly in advance of its time. The author was 78 when it was published and it 
appears that he excogitated the basic ideas for himself without reference to previous 
writings. The memoir may, in actual fact, have been written rather earlier. Laplace’s
article of 1774 refers to manuscripts of Bernoulli and Lagrange which he had heard of but
not seen. An announcement of their existence, says Laplace sublimely, reawakened his 
interest in the subject. Laplace was 25 at the time. 

5. I am much indebted to my colleague Mr C. G. Allen for the translations of the articles
by Bernoulli and Euler which follow. They are, I felt, of sufficient interest to justify the 
publication of an English version, especially Bernoulli’s. The reasoning is so clear that 
I can leave Daniel to tell his own story, but perhaps I may direct attention to two points: 

(a) Influenced by the belief that an error distribution must have a finite range, Bernoulli 
runs into trouble with the parameter determining that range. He assumes a semi-circular 
distribution and lays down the peculiar condition that any distribution must be abrupt at 
its terminals. Once this is done, however, his formulation of maximum likelihood is clear 
and explicit and he derives what would nowadays be called the ML equations by differ- 
entiating the likelihood of the sample. 

(b) In § 16 he is right on the verge of a principle of minimal variance. In comparing two 
methods of estimation he points out that one (the ML method) gives samples which are 
closer to the true value than the other. 

6. The commentary by Euler seems to me of less value. He points out, correctly in my 
opinion, that the ML principle is arbitrary in the sense that there is no logical reason to 
believe that observations come from a generating system which gives them the greatest 
probability. (Bernoulli admits that his reasoning on this point is metaphysical, but at least 
he does reason about it.) Euler then goes on to propound principles which seem to me to 
be much more open to doubt than the one he is trying to replace. His examples at the end, 
in which he has to manoeuvre his error-range to avoid imaginary solutions, end rather
lamely with the conclusion that it doesn’t matter much anyway. However, it is always of 
interest to read what a great mind has to offer on a subject. Nor should we forget, perhaps, 
that at the time of publication Euler himself was 71 and had been blind for 10 years. 


REFERENCES 


Boscovich, R. G. (1755). (In Maire, C. and Boscovich.) De litteraria expeditione per Pontificiam
ditionem ad dimetiendos duos meridiani gradus. Romae.

Cotes, R. (1722). Aestimatio errorum in mixta mathesi, per variationes partium trianguli plani et
spherici. Opera Miscellanea, Cantabrigiae.

Euler, L. (1749). Pièce qui a remporté le prix de l’Académie Royale des Sciences en 1748, sur les
inégalités du mouvement de Saturne et de Jupiter. Paris.

Lagrange, J-L. (1770-3). Mémoire sur l’utilité de la méthode de prendre le milieu des résultats de
plusieurs observations etc. Miscellanea Taurinensia, 5, 167.

Laplace, P. S. (1774). Déterminer le milieu que l’on doit prendre entre trois observations données
d’un même phénomène. Mém. Acad. Paris (par divers savants), 4, 634.

Mayer, T. (1750). Abhandlung über die Umwälzung des Mondes um seine Axe. Kosmographische
Nachrichten und Sammlung.

Simpson, T. (1756). A letter ... on the advantage of taking the mean of a number of observations in
practical astronomy. Phil. Trans. 44, 82.

Simpson, T. (1757). An attempt to show the advantage arising by taking the mean of a number of 
observations in practical astronomy. Miscellaneous Tracts, London. 













The most probable choice between several discrepant observations and 
the formation therefrom of the most likely induction 


By DANIEL BERNOULLI†
Translated by C. G. Allen 


British Library of Political and Economic Science, London School of 
Economics and Political Science 


1. Astronomers as a class are men of the most scrupulous sagacity; it is to them therefore 
that I choose to propound those doubts that I have sometimes entertained about the 
universally accepted rule for handling several slightly discrepant observations of the same 
event. By this rule the observations are added together and the sum divided by the number
of observations; the quotient is then accepted as the true value of the required quantity,
until better and more certain information is obtained. In this way, if the several observa- 
tions can be considered as having, as it were, the same weight, the centre of gravity is 
accepted as the true position of the objects under investigation. This rule agrees with that 
used in the theory of probability when all errors of observation are considered equally likely. 

2. But is it right to hold that the several observations are of the same weight or moment, 
or equally prone to any and every error? Are errors of some degrees as easy to make as others 
of as many minutes? Is there everywhere the same probability? Such an assertion would 
be quite absurd, which is undoubtedly the reason why astronomers prefer to reject com- 
pletely observations which they judge to be too wide of the truth, while retaining the rest 
and, indeed, assigning to them the same reliability. This practice makes it more than clear 
that they are far from assigning the same validity to each of the observations they have 
made, for they reject some in their entirety, while in the case of others they not only retain 
them all but, moreover, treat them alike. I see no way of drawing a dividing line between 
those that are to be utterly rejected and those that are to be wholly retained; it may even 
happen that the rejected observation is the one that would have supplied the best correction 
to the others. Nevertheless, I do not condemn in every case the principle of rejecting one 
or other of the observations, indeed I approve it, whenever in the course of observation an 
accident occurs which in itself raises an immediate scruple in the mind of the observer, 
before he has considered the event and compared it with the other observations. If there is 
no such reason for dissatisfaction I think each and every observation should be admitted 
whatever its quality, as long as the observer is conscious that he has taken every care. 

3. Let us compare the observer with an archer aiming his arrows at a set mark with all 
the care that he can muster. Let his mark be a continuous vertical line so that only devia- 
tions in a horizontal direction are taken into account; let the line be supposed to be drawn 
in the middle of a vertical plane erected perpendicular to the axis of vision, and let the whole 
of the plane on either side be divided into narrow vertical bands of equal width. Now if the 
arrow be loosed several times, and for each shot the point of impact be examined and its 
distance from the vertical mark noted on a sheet, though the outcome cannot in the least 
be exactly predicted, yet there are many assumptions that can reasonably be made and
which can be useful to our inquiry, provided all the errors are such as may as easily be in
one direction as the other, and their outcome is quite uncertain, being decided only as it
were by unavoidable chance. In astronomy, likewise, anything which admits of correction
a priori is not reckoned as an error. When all those corrections have been made which theory
enjoins, any further correction which is necessary in order to reconcile the several
observations which differ slightly from each other is a matter solely for the theory
of probability. What in particular happens in the course of observation, ex hypothesi we
scarcely know, but this very ignorance will be the refuge to which we are forced to flee when
we take our stand on what is not truest but most likely, not certain but most probable
(non verissimum sed verisimillimum, non certum sed probabilissimum), as the theory of
probability teaches. Whether that is always and everywhere identical with the usually
accepted arithmetical mean may reasonably be doubted.

† This memoir and the following commentary by Euler appeared in Latin in the memoirs of the
Academy of St Petersburg, Acta Acad. Petrop. (1777), pp. 3-33. A photostat copy has been deposited
in the library of the Royal Statistical Society.

4. Errors, which are unavoidable in observation, may indeed affect individual obser- 
vations; nevertheless, any given observation has its own rights and could not be impugned 
if it were the only one that had been made. Any observation must therefore be in itself 
sound and good, and no-one ought to assign any other value than that ascertained thereby; 
but since they are mutually contradictory, a value has to be assigned to the whole complex 
of observations without touching the parts. In this way a definite error is attributed to the 
individual observations; but I think that of all the innumerable ways of dealing with errors
of observation one should choose the one that has the highest degree of probability for the 
complex of observations as a whole. 

The rule which I here propound will be accepted by all, provided that the degree of 
probability in respect of a given observation can be defined in terms of a point which is 
assumed to be true. I freely admit that this last condition has not been definitely met; at 
the same time I am convinced that all things are not equally uncertain and that better 
results can be got than can be expected from the commonly accepted rule. Let us see if 
certain assumptions should not properly be made in this argument which contribute some- 
thing to a higher probability. I will begin the examination with some general considerations. 

5. If the archer whom I mentioned in §3 makes innumerable shots, all with the utmost 
possible care, the arrows will strike sometimes the first band next to the mark, sometimes 
the second, sometimes the third and so on, and this is to be understood equally of either side 
whether left or right. Now is it not self-evident that the hits must be assumed to be thicker 
and more numerous on any given band the nearer this is to the mark? If all the places on 
the vertical plane, whatever their distance from the mark, were equally liable to be hit, the 
most skilful shot would have no advantage over a blind man. That, however, is the tacit 
assertion of those who use the common rule in estimating the value of various discrepant 
observations, when they treat them all indiscriminately. In this way, therefore, the degree 
of probability of any given deviation could be determined to some extent a posteriori, 
since there is no doubt that, for a large number of shots, the probability is proportional to 
the number of shots which hit a band situated at a given distance from the mark. 

Moreover, there is no doubt that the greatest deviation has its limits which are never 
exceeded and which indeed are narrowed by the experience and skill of the observer. Beyond 
these limits all probability is zero; from the limits towards the mark in the centre the 
probability increases and will be greatest at the mark itself. 

6. The foregoing give some idea of a scale of probabilities for all deviations, such as each 
observer should form for himself. It will not be absolutely exact, but it will suit the nature
of the inquiry well enough. The mark set up is, as it were, the centre of forces to which the
observers are drawn; but these efforts are opposed by innumerable imperfections and other 
tiny hidden obstacles which may produce in the observations small chance errors. Some of 
these will be in the same direction and will be cumulative, others will cancel out, according 
as the observer is more or less lucky. From this it may be understood that there is some 
relation between the errors which occur and the actual true position of the centre of forces; 
for another position of the mark the outcome of chance would be estimated differently. 
So we arrive at the particular problem of determining the most probable position of the 
mark from a knowledge of the positions of some of the hits. It follows from what we have 
adduced that one should think above all of a scale (scala) between the various distances 
from the centre of forces and the corresponding probabilities. Vague as is the determination 
of this scale, it seems to be subject to various axioms which we have only to satisfy to be in 
a better case than if we suppose every deviation, whatever its magnitude, to occur with 
equal ease and therefore to have equal probability. Let us suppose a straight line in which 
there are disposed various points, which indicate of course the results of different obser- 
vations. Let there be marked on this line some intermediate point which is taken as the 
true position to be determined. Let perpendiculars expressing the probability appropriate 
to a given point be erected. If now a curve is drawn through the ends of the several per- 
pendiculars this will be the scale of the probabilities of which we are speaking. 

7. If this is accepted, I think the following assumptions about the scale of probabilities 
can hardly be denied. 

(a) Inasmuch as deviations from the true intermediate point are equally easy in both 
directions, the scale will have two perfectly similar and equal branches. 

(b) Observations will certainly be more numerous and indeed more probable near to the 
centre of forces; at the same time they will be less numerous in proportion to their distance 
from that centre. The scale therefore on both sides approaches the straight line on which we 
supposed the observed points to be placed. 

(c) The degree of probability will be greatest in the middle where we suppose the centre 
of forces to be located, and the tangent to the scale for this point will be parallel to the 
aforesaid straight line. 

(d) If it is true, as I suppose, that even the least-favoured observations have their limits,
best fixed by the observer himself, it follows that the scale, if correctly arranged, will meet 
the line of the observations at the limits themselves. For at both extremes all probability 
vanishes and a greater error is impossible. 

(e) Finally, the maximum deviations on either side are reckoned to be a sort of boundary 
between what can happen and what cannot. The last part, therefore, of the scale, on either 
side, should approach steeply the line on which the observations are sited, and the tangents 
at the extreme points will be almost perpendicular to that line. The scale itself will thus 
indicate that it is scarcely possible to pass beyond the supposed limits. Not that this 
condition should be applied in all its rigour if, that is, one does not fix the limits of error 
over-dogmatically. 

8. If we now construct a semi-ellipse of any parameter on the line representing the whole 
field of possible deviations as its axis, this will certainly satisfy the foregoing conditions 
quite well. The parameter of the ellipse is arbitrary, since we are concerned only with the 
proportion between the probabilities of any given deviation. However elongated or com- 
pressed the ellipse may be, provided it is constructed on the same axis, it will perform the
same function; which shows that we have no reason to be anxious about an accurate
description of the scale. In fact we can even use a circle, not because it is proved to be the 
true scale by mathematical reasoning, but because it is nearer the truth than an infinite 
straight line parallel to the axis, which supposes that the several observations are of equal 
weight and probability, however distant from the true position. This circular scale also 
lends itself best to numerical calculations; meanwhile it is worth observing in advance that 
both hypotheses come to the same whenever the several observations are considered to be 
infinitely small. They also agree if the radius of the auxiliary circle is supposed to be in- 
finitely large, as if no limits were set to the deviations. Thus if the deviation of an obser- 
vation from the true position is thought of as the sine of a circular arc, the probability of 
that observation will be the cosine of the same arc. Let the auxiliary semicircle, which I 
have just described, be called the controlling semicircle (moderator). Where the centre of this 
semicircle is located, the true position, which fits the observations best, is to be fixed. 
Admittedly our hypothesis is, to some extent, precarious, but it is certainly to be preferred 
to the common one, and will not be hazardous to those who understand it, since the result 
that they will arrive at will always have a higher probability than if they had adhered to 
the common method. When by the nature of the case a certain decision cannot be reached, 
there is no other course than to prefer the more probable to the less probable.
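The circular scale of § 8 can be sketched in a few lines of Python. This is an illustration only: the function name `scale` and the decision to leave the value unnormalised are assumptions of the sketch, not anything stated in the memoir. It returns the height of the controlling semicircle at a given deviation, and zero beyond the limits.

```python
import math

def scale(deviation, r):
    """Unnormalised semicircular scale: height of the controlling semicircle.

    `scale`, `deviation` and `r` are hypothetical names chosen for this sketch.
    """
    if abs(deviation) >= r:
        return 0.0  # beyond the limits all probability is taken to be zero
    return math.sqrt(r * r - deviation * deviation)

# With unit radius, a deviation equal to the sine of an arc
# has a scale value equal to the cosine of the same arc.
arc = 0.3
print(scale(math.sin(arc), 1.0), math.cos(arc))
```

Only ratios of scale values matter for the argument, which is why no normalising constant is included here.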

9. I will illustrate this line of argument by a trivial example. The particular problem is 
the reconciliation of discrepant observations; it is therefore a question of difference of 
observations. Now if a dice-thrower makes three throws with one die so that the second 
exceeds the first by one and the third exceeds the second by two, the throws may arise in 
three ways, viz. 1, 2, 4 or 2, 3, 5 or 3, 4, 6. None of these throws is to be preferred to the other
two, for each is in itself equally probable. If you prefer the one in the middle, viz. 2, 3, 5, the 
preference is illogical. The same sort of thing happens if you choose to consider observations 
which, so far as you are concerned, are accidental, whether they are astronomical or of 
some other kind, as equally probable. Now suppose the thrower produces the same result 
by throwing a pair of dice three times. There will then be eight different ways in which he 
would obtain this result, viz. 2, 3, 5; 3, 4, 6; 4, 5, 7; 5, 6, 8; 6, 7, 9; 7, 8, 10; 8, 9, 11 and 9, 10, 12.
But they are far from being all equally probable. It is well known that the respective 
probabilities are proportional to the numbers 8, 30, 72, 100, 120, 80, 40 and 12. From this 
known scale I have better right to conclude that the fifth set has happened than that any 
other has, because it has the highest probability; and so the three throws of a pair of dice 
will have been 6, 7 and 9. No-one, however, will deny that the first set 2, 3 and 5 might 
possibly have happened, even though it has only a fifteenth part of the probability corre- 
sponding to the fifth set. Forced to choose, I simply choose what is most probable. Although 
this example does not quite square with our argument, it makes clear what contribution the 
investigation of probabilities can make to the determination of cases. Now I will come more 
to grips with the actual problem. 
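The dice illustration of § 9 can be checked by direct enumeration. The sketch below is not part of the memoir, and the helper names `sum_counts` and `candidate_sets` are hypothetical. It counts the ways of throwing each total and multiplies the counts for the three throws of each admissible set. Direct counting gives 90 as the weight of the sixth set (7, 8, 10), where the text as printed has 80; the conclusion that 6, 7, 9 is the most probable set is unaffected.

```python
from itertools import product

def sum_counts(n_dice):
    """Number of ways each total can be thrown with n_dice fair dice."""
    counts = {}
    for faces in product(range(1, 7), repeat=n_dice):
        total = sum(faces)
        counts[total] = counts.get(total, 0) + 1
    return counts

def candidate_sets(n_dice):
    """Sets (t, t+1, t+3) of three throws, each with its relative weight."""
    counts = sum_counts(n_dice)
    results = []
    for first in sorted(counts):
        throws = (first, first + 1, first + 3)
        if all(t in counts for t in throws):
            weight = 1
            for t in throws:
                weight *= counts[t]  # independent throws: multiply the counts
            results.append((throws, weight))
    return results

for throws, weight in candidate_sets(2):
    print(throws, weight)
```

With one die the same function returns three sets of equal weight, which is the first half of the illustration.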

10. First of all, I would have every observer ponder thoroughly in his own mind and judge 
what is the greatest error which he is morally certain (though he should call down the 
wrath of heaven) he will never exceed however often he repeats the observation. He must 
be his own judge of his dexterity and not err on the side of severity or indulgence. Not that 
it matters very much whether the judgement he passes in this matter is fitting or somewhat 
flighty. Then let him make the radius of the controlling circle equal to the aforementioned 
greatest error; let this radius be r and hence the width of the whole doubtful field = 2r. 







If you desire a rule on this matter common to all observers, I recommend you to suit your 
judgement to the actual observations that you have made: if you double the distance 
between the two extreme observations, you can use it, I think, safely enough as the diameter 
of the controlling circle, or, what comes to the same thing, if you make the radius equal to 
the difference between the two extreme observations. Indeed, it will be sufficient to increase 
this difference by half to form the diameter of the circle if several observations have been 
made; my own practice is to double it for three or four observations, and to increase it by 
half for more. Lest this uncertainty offend any one, it is as well to note that if we were to 
make our controlling semicircle infinite we should then coincide with the generally accepted 
rule of the arithmetical mean; but if we were to diminish the circle as much as possible 
without contradiction, we should obtain the mean between the two extreme observations, 
which as a rule for several observations I have found to be less often wrong than I thought 
before I investigated the matter. 
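The rule of thumb for the radius can be written down as a small function. This is a sketch under stated assumptions: the name `controlling_radius` is hypothetical, and treating every sample of four or fewer observations by the "double the range" rule is one reading of the text, which speaks only of "three or four" observations.

```python
def controlling_radius(observations):
    """Radius of the controlling circle from the range of the observations.

    Hypothetical helper: the diameter is twice the range for up to four
    observations, and one and a half times the range for more.
    """
    n = len(observations)
    if n < 2:
        raise ValueError("at least two observations are needed to form a range")
    spread = max(observations) - min(observations)
    factor = 2.0 if n <= 4 else 1.5  # diameter = factor * range
    return factor * spread / 2.0

print(controlling_radius([0.0, 0.2, 1.0]))            # three observations
print(controlling_radius([0.0, 1.0, 2.0, 3.0, 4.0]))  # five observations
```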

11. After all these preliminaries it remains to determine the position of the controlling 
circle, since it is at the centre of this circle that the several observations should be deemed 
to be, as it were, concentrated. The aforesaid position is deduced from the fact that the whole 
complex of observations would occur more easily, and therefore more probably, for this 
location than for any other position of the circle. We shall have the true degree of probability 
for the whole complex of observations if we note the probability corresponding to the 
several observations that have been carried out and multiply all the probabilities by each 
other, just as we did in § 9. Then the product of the multiplication is to be differentiated and 
the differential put = 0. In this way we shall obtain an equation whose root will give the 
distance of the centre from any given point. 

Put the radius of the controlling circle = r; the smallest observation = A; the second
A + a; the third A + b; the fourth A + c, and so on; the distance of the centre of the controlling
semicircle from the smallest observation = x, so that A + x will denote the quantity which
is most probably to be assumed on the basis of all the observations. By our hypothesis
the probability for the first observation alone is to be expressed by $\sqrt{r^2 - x^2}$; for the
second observation by $\sqrt{r^2 - (x-a)^2}$; for the third by $\sqrt{r^2 - (x-b)^2}$; for the fourth by
$\sqrt{r^2 - (x-c)^2}$ and so on. Then I would have the several probabilities multiplied together
according to the rules of the theory of probability, which gives


\[ \sqrt{r^2 - x^2} \times \sqrt{r^2 - (x-a)^2} \times \sqrt{r^2 - (x-b)^2} \times \sqrt{r^2 - (x-c)^2} \times \dots \]


Finally, if the differential of this product is put = 0, the equation, by virtue of our hypo- 
theses, gives the required value x as having the highest probability. As, however, the afore- 
said quantity is to be brought to its maximum value, it is obvious that its square will 
simultaneously be brought to the same state. So we can use, for ease of calculation, a 
formula which is composed entirely of rational terms, viz. 


\[ (r^2 - x^2) \times \{r^2 - (x-a)^2\} \times \{r^2 - (x-b)^2\} \times \{r^2 - (x-c)^2\} \times \dots \]


and the differential is once more put = 0. For the rest, as many factors are to be taken as 
there were observations. 
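The procedure of § 11 can also be carried out numerically rather than by differentiation. The following is a minimal sketch, not the author's method: it evaluates the rational product directly and locates its maximum by a ternary search, which is legitimate here because the logarithm of the product is concave wherever every factor is positive. The names `squared_likelihood` and `most_probable_centre` are hypothetical.

```python
def squared_likelihood(centre, observations, r):
    """Product of r^2 - (centre - observation)^2 over all observations."""
    value = 1.0
    for obs in observations:
        factor = r * r - (centre - obs) ** 2
        if factor <= 0.0:
            return 0.0  # an observation at or beyond the limits: probability zero
        value *= factor
    return value

def most_probable_centre(observations, r, tol=1e-10):
    """Centre of the controlling circle that maximises the product."""
    lo = max(observations) - r  # the centre must lie within r of every observation
    hi = min(observations) + r
    if lo >= hi:
        raise ValueError("radius too small to cover all the observations")
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if squared_likelihood(m1, observations, r) < squared_likelihood(m2, observations, r):
            lo = m1  # the maximum lies to the right of m1
        else:
            hi = m2  # the maximum lies to the left of m2
    return (lo + hi) / 2.0

print(most_probable_centre([0.0, 0.2, 1.0], 1.0))
```

Working with the rational product instead of the product of square roots follows the remark in the text that both reach their maximum at the same point.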

12. If a single observation was made, we must accept the observation as true. Now this
is shown by our hypothesis. If only the first factor $r^2 - x^2$ is taken, we shall have $-2x\,dx = 0$
or $x = 0$ and consequently $A + x = A$. So in this case our hypothesis agrees with the common
one. 








If two observations have been made, A and A + a, two factors are to be taken, namely
\[ \{r^2 - x^2\} \times \{r^2 - (x-a)^2\} \quad\text{or}\quad r^4 - 2r^2x^2 + x^4 + 2ar^2x - a^2r^2 - 2ax^3 + a^2x^2, \]
the differential of which
\[ = -4r^2x\,dx + 4x^3\,dx + 2ar^2\,dx - 6ax^2\,dx + 2a^2x\,dx = 0 \quad\text{or}\quad 2x^3 - 3ax^2 - 2r^2x + a^2x + ar^2 = 0. \]


The only useful root which this equation gives is $x = \tfrac{1}{2}a$, and $A + x = A + \tfrac{1}{2}a$. This also is
the teaching of the common hypothesis. This agreement holds whatever be the radius of 
the controlling circle, a fact which shows clearly enough, in the case of several observations, 
that the size of our controlling circle in an enterprise of this sort need not be strictly exact, 
and one should not expect it to be. What is awkward—and I do not conceal it—is that for 
several observations a very long calculation is required, and so I hardly dare propose more 
than general discussions of these cases. Let me at least expound the theory of three observa- 
tions, which is of the highest importance. 
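Why $x = \tfrac{1}{2}a$ is the only useful root of the two-observation equation can be seen by factorising the cubic. The short derivation below is a sketch added for illustration and is not part of the memoir; it assumes $a > 0$ and takes the admissible values of $x$ to be those for which both factors are positive, that is $a - r < x < r$.

```latex
\begin{align*}
2x^3 - 3ax^2 - 2r^2x + a^2x + ar^2 &= (2x - a)\,(x^2 - ax - r^2), \\
x^2 - ax - r^2 = 0 \;&\Longrightarrow\; x = \tfrac{1}{2}\bigl(a \pm \sqrt{a^2 + 4r^2}\bigr).
\end{align*}
```

Since $\sqrt{a^2 + 4r^2} > 2r$, the larger of these two roots exceeds $r$ and the smaller is less than $a - r$, so both fall outside the admissible interval whatever the radius.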

13. When we have three observations to deal with, viz. $A$; $A + a$ and $A + b$, we shall
have three factors
\[ \{r^2 - x^2\} \times \{r^2 - (x-a)^2\} \times \{r^2 - (x-b)^2\}, \]
for which we have to find the maximum value. If now these factors are actually multiplied
together we shall obtain
\[
\begin{aligned}
& r^6 + 2ar^4x - 3r^4x^2 - 4ar^2x^3 + 3r^2x^4 + 2ax^5 - x^6 \\
& - a^2r^4 - 2ab^2r^2x + 2b^2r^2x^2 + 2ab^2x^3 - b^2x^4 + 2bx^5 \\
& - b^2r^4 + 2br^4x - a^2b^2x^2 - 4br^2x^3 - 4abx^4 \\
& + a^2b^2r^2 - 2a^2br^2x + 4abr^2x^2 + 2a^2bx^3 - a^2x^4 \\
& + 2a^2r^2x^2.
\end{aligned}
\]
If this expression is differentiated, and then after division by $dx$ is put = 0 to obtain the
maximum value, the following general equation for any three observations whatsoever will
result
\[
\begin{aligned}
& 2ar^4 - 6r^4x - 12ar^2x^2 + 12r^2x^3 + 10ax^4 - 6x^5 \\
& - 2ab^2r^2 + 4b^2r^2x + 6ab^2x^2 - 4b^2x^3 + 10bx^4 \\
& + 2br^4 - 2a^2b^2x - 12br^2x^2 - 16abx^3 \\
& - 2a^2br^2 + 8abr^2x + 6a^2bx^2 - 4a^2x^3 \\
& + 4a^2r^2x = 0.
\end{aligned}
\]
The root of this equation, which is indeed of the fifth degree and consists of twenty terms,
gives the distance of the centre of the controlling circle from the first observation, and the
quantity $A + x$ gives the value which is most probably to be deduced from the three obser-
vations which have been made.
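The twenty terms of the general equation can be collected by powers of x and the useful root found by bisection. The sketch below is offered as an illustration only: `quintic_coefficients`, `evaluate` and `useful_root` are hypothetical names, and the bisection relies on the assumption that the product of the three factors is positive strictly inside the admissible interval and zero at its ends, so that its derivative changes sign exactly once there.

```python
def quintic_coefficients(a, b, r):
    """Coefficients c[0], ..., c[5] of the general equation, c[k] multiplying x**k."""
    r2, r4 = r * r, r ** 4
    return [
        2*a*r4 + 2*b*r4 - 2*a*b*b*r2 - 2*a*a*b*r2,
        -6*r4 + 4*b*b*r2 - 2*a*a*b*b + 8*a*b*r2 + 4*a*a*r2,
        -12*a*r2 + 6*a*b*b - 12*b*r2 + 6*a*a*b,
        12*r2 - 4*b*b - 16*a*b - 4*a*a,
        10*a + 10*b,
        -6.0,
    ]

def evaluate(coefficients, x):
    """Horner evaluation of the polynomial at x."""
    total = 0.0
    for c in reversed(coefficients):
        total = total * x + c
    return total

def useful_root(a, b, r, tol=1e-12):
    """Root lying within r of all three observations (offsets 0, a, b)."""
    coefficients = quintic_coefficients(a, b, r)
    lo = max(0.0, a, b) - r
    hi = min(0.0, a, b) + r
    if lo >= hi:
        raise ValueError("radius too small to cover all the observations")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if evaluate(coefficients, mid) > 0.0:
            lo = mid  # product still increasing: the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2.0

print(quintic_coefficients(0.2, 1.0, 1.0))
print(useful_root(0.2, 1.0, 1.0))
```

The inputs a = 0.2, b = 1 and r = 1 are chosen purely as sample values; any offsets with a radius large enough to cover them could be used instead.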

14. Unless the force of our fundamental arguments has been most attentively weighed 
there will be few perhaps who will see any relation whatever between the enormous equation 
and what seems to be a very simple question; for the common answer is $x = \tfrac{1}{3}(a+b)$.
Nevertheless, our equation corresponds well enough to notions which crop up elsewhere, some of
which I will now expound. 

(a) If the radius of the controlling circle is supposed to be infinite compared with a and b,
all terms are to be rejected except those in which r rises to the highest power, in which case
our equation is reduced to this very simple one $2ar^4 + 2br^4 - 6r^4x = 0$ or $x = \tfrac{1}{3}(a+b)$. So
the common rule is contained in our equation. If, however, our definition set out in § 10 
is considered, it will be obvious how unfitting is the hypothesis of an infinite radius and how 
manifestly some more suitable one could be substituted for it. 

(b) If we put b = 2a, it is obvious that x = a whatever value is given to the radius r, 
and that too will be common to both theories. Let us see therefore what our equation shows 
for this case. Substituting for b the equation becomes 

$$
\begin{aligned}
& 6ar^4 - 6r^4x - 36ar^2x^2 + 12r^2x^3 + 30ax^4 - 6x^5 \\
& - 12a^3r^2 + 36a^2r^2x + 36a^3x^2 - 52a^2x^3 \\
& - 8a^4x = 0.
\end{aligned}
$$
Now this equation, whatever be the value of r, is satisfied by x = a, which the nature of the 
case demands. 

(c) If b = −a, x must equal 0 whatever be the value of r. This too is beautifully shown by
our equation, which now becomes

$$-6r^4x + 12r^2x^3 - 6x^5 - 2a^4x + 8a^2x^3 = 0.$$
A glance will show that the useful root is x = 0. 
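
The three corollaries can be checked numerically. The sketch below assumes that the product being maximised is $(r^2 - x^2)\{r^2 - (x-a)^2\}\{r^2 - (x-b)^2\}$ and groups the twenty terms of the fifth-degree equation by powers of x; the function names `quintic_coeffs` and `evaluate` are illustrative and do not come from the original.

```python
def quintic_coeffs(a, b, r):
    """Coefficients c0..c5 (ascending powers of x) of the fifth-degree equation."""
    r2, r4 = r ** 2, r ** 4
    return [
        2 * (a + b) * r4 - 2 * a * b * (a + b) * r2,         # constant term
        -6 * r4 + 4 * (a + b) ** 2 * r2 - 2 * (a * b) ** 2,  # x
        -12 * (a + b) * r2 + 6 * a * b * (a + b),            # x^2
        12 * r2 - 4 * (a * a + b * b) - 16 * a * b,          # x^3
        10 * (a + b),                                        # x^4
        -6.0,                                                # x^5
    ]


def evaluate(coeffs, x):
    """Horner evaluation of a polynomial given in ascending powers of x."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * x + c
    return total


# (a) r very large: the r^4 terms dominate and the root tends to (a + b) / 3.
big_r = evaluate(quintic_coeffs(0.2, 1.0, 1e3), (0.2 + 1.0) / 3) / 1e12
# (b) b = 2a: x = a is a root whatever r is.
case_b = evaluate(quintic_coeffs(0.7, 1.4, 2.5), 0.7)
# (c) b = -a: x = 0 is a root whatever r is.
case_c = evaluate(quintic_coeffs(0.7, -0.7, 2.5), 0.0)
print(big_r, case_b, case_c)   # all three should be close to zero
```

The values a = 0.7 and r = 2.5 are arbitrary test inputs; any other choice should serve equally well.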

15. This and other similar corollaries sufficiently confirm the real connexion of our 
fundamental arguments with the question under discussion, however enormous the equa- 
tion we have found may seem in so simple an inquiry. I proceed to examples in which the 
radius of the controlling circle is neither infinite nor indifferent, which is where practically 
all cases belong. In these examples our new theory always produces a different result from 
the common one; and the more the intermediate observation approaches either extreme, 
the greater the difference. It is on the discussion of these cases that the matter hinges, so 
we must have recourse to purely numerical examples. 


Example 1. Let us assume three observations
A, A + 0.2 and A + 1,
so that a = 0.2 and b = 1,
and let the value to be assumed as most likely from these three observations be A + x. The
common rule gives x = 0.4. Let us see the new one which to my mind is more probable,
and let us put r = 1 (cf. § 10). The following purely numerical equation results

$$1.92 - 0.32x - 12.96x^2 + 4.64x^3 + 12x^4 - 6x^5 = 0,$$

the solution of which is approximately x = 0.4427, which exceeds the commonly accepted
value by more than a tenth. This marked excess is due to the fact that the middle observa- 
tion is much nearer to the first than to the third. From this it is easily deduced that the 
excess will be changed to a defect if the middle observation is nearer to the third than to the
first, and that the nearer the middle observation is to the mean between the two extreme 
observations, the smaller will be this defect. To test this conjecture I retain the other values 
and change only the middle observation, as follows. 


Example 2. Let a now = 0.56, and as before r = b = 1. By the commonly accepted rule
we shall have x = 0.52. Let us see what happens with ours. The equation of § 13 gives the
following numerical equation

$$1.3728 + 3.1072x - 13.4784x^2 - 2.2144x^3 + 15.6x^4 - 6x^5 = 0,$$

which is approximately satisfied by x = 0.5128. In accordance with our principles, the
value of x is less than the arithmetical mean which is usually accepted, but the difference
between the two is now quite small, viz. 0.0072, exactly as I had anticipated would be the
case. Hence it can also be seen that the greatest difference between the two estimates occurs
when it so happens that two observations exactly coincide and only the third diverges.
There are two cases, viz. when a = 0 and when a = b. I will expound the result in each case.


Example 3. Put a = 0, leaving the remaining denominations unaltered. Dividing by
2b − 2x we have the following numerical equation

$$1 - 6x^2 - 2x^3 + 3x^4 = 0,$$

which is approximately satisfied by x = 0.3977, whereas the value of x obtained from the
common rule is x = 0.3333. The former exceeds the latter by 0.0644. If, however, we put
a = b and divide by 2x, the following equation results

$$4 - 6x - 6x^2 + 10x^3 - 3x^4 = 0.$$

This is approximately satisfied by x = 0.6022, while the common value is 0.6666. So the
difference between the two is once more 0.0644, but this time our new value is less than the
common one, whereas previously it was greater. It is clear from this that our method takes 
better aim at a certain intermediate point than does the common method. Evidence of this 
sort does much to commend the method that I propose, and I will go a little more closely 
into this consideration, if so be that an argumentum ad hominem may be accepted in a matter 
which does not admit of mathematical demonstration. 
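
The numerical roots quoted in Examples 1 and 3 can be reproduced by simple bisection. The sketch below is only an illustration: the coefficient lists are those obtained by substituting the stated values of a, b and r into the equation of § 13 (for a = b, after division by 2x), and the helper names are hypothetical.

```python
def evaluate(coeffs, x):
    """Horner evaluation of a polynomial given in ascending powers of x."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * x + c
    return total


def bisect_root(coeffs, lo, hi, tol=1e-10):
    """Root of the polynomial in [lo, hi]; requires a sign change across the interval."""
    f_lo = evaluate(coeffs, lo)
    if f_lo * evaluate(coeffs, hi) > 0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = evaluate(coeffs, mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


example_1 = [1.92, -0.32, -12.96, 4.64, 12.0, -6.0]   # a = 0.2, b = r = 1
example_3_a0 = [1.0, 0.0, -6.0, -2.0, 3.0]            # a = 0, divided by 2b - 2x
example_3_ab = [4.0, -6.0, -6.0, 10.0, -3.0]          # a = b, divided by 2x

# With r = b = 1 every observation has positive probability only for 0 < x < 1.
roots = [bisect_root(c, 0.0, 1.0) for c in (example_1, example_3_a0, example_3_ab)]
print(roots)
```

Bisection is used here only because it needs no derivative and cannot leave the interval (0, 1); any other root-finder would do.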

16. If we combine the two cases in example 3, and suppose that six observations have 
been made, viz. A, A, A + b and A + b, A + b, A, it is obvious that three observations support
the value A and the same number the value A + b. We see by § 12 that in this case both
methods give the required mean value as $A + \tfrac{1}{2}b$, or for example 3, A + 0.5; or, omitting
the constant quantity A, simply 0.5. This value, derived from the six observations combined,
will not be doubted by anyone. Now let us divide these six observations into two other
triads, namely A, A, A + 1 and A + 1, A + 1, A. In this case, rejecting once more the quantity
A, the commonly accepted rule gives for the first triad 0.3333 and for the second 0.6666, both
differing, the first by defect and the second by excess, by 0.1666 from the mean 0.5. So for
either triad of observations taken separately the common theory involves an error of 0.1666,
while ours involves an error of 0.1022, which is notably smaller. A great deal more evidence
of this kind could be adduced to give further support to our fundamental argument; but 
I am afraid I should appear immoderate if I went on extending something which cannot be 
settled with certainty and absolute perfection. We have no higher aim than to be able to 
distinguish what is more probable from what is less. 
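
The comparison of the two triads with the six combined observations can be sketched by maximising the product directly, for any number of observations. The code below assumes the semicircular scale with r = 1 and works with the logarithm of the product of the squared probabilities, which has the same maximiser; `most_probable_centre` is an illustrative name, not the author's.

```python
import math


def log_product(x, observations, r):
    """Sum of log(r^2 - (x - o)^2); -inf if an observation falls outside the scale."""
    total = 0.0
    for o in observations:
        gap = r * r - (x - o) ** 2
        if gap <= 0.0:
            return -math.inf
        total += math.log(gap)
    return total


def most_probable_centre(observations, r, tol=1e-10):
    """Ternary search for the maximiser; the log-product is concave on its domain."""
    lo = max(observations) - r
    hi = min(observations) + r
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if log_product(m1, observations, r) < log_product(m2, observations, r):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


first_triad = most_probable_centre([0.0, 0.0, 1.0], r=1.0)
second_triad = most_probable_centre([1.0, 1.0, 0.0], r=1.0)
all_six = most_probable_centre([0.0, 0.0, 0.0, 1.0, 1.0, 1.0], r=1.0)

errors_new = (0.5 - first_triad, second_triad - 0.5)
errors_common = (0.5 - 1.0 / 3.0, 2.0 / 3.0 - 0.5)
print(all_six, errors_new, errors_common)
```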

17. Such further perfection as we may reasonably expect will consist in a stricter and more 
accurate determination of the controlling scale and its width. I will add a few further com- 
ments on this topic. It is obvious from the foregoing considerations that our estimates are 
not so very different from the commonly accepted rule: so it is a question of a certain correc- 
tion which this rule appears to allow. This correction is provided by the actual divergences 
of the observations from the required true point, since they can be so arranged, for any given 
width of the controlling scale, as to make the most probable fit with this point. But for my
part I can see no way of strictly determining the width of the aforesaid scale except that which
I mentioned in § 10. If an observer, through undue mistrust of his own powers, enlarges the
dimensions of the controlling semicircle excessively, it will not give all the help it might,
but what it gives will be more certain; if on the other hand he contracts the scale unduly, 
other things being equal he will arrive at a correction which is a little greater and somewhat 
less probable. Prudence seems to be as necessary here as sharp-sightedness. Should you 
wish to use the observations that have actually been made as a basis for an a posteriori 
estimate of the width of the controlling scale to be applied, it will be prudent to weigh in 
your own mind whether one should consider the observations to have turned out luckily 
or not. The more you assign to good luck, the less you can attribute to the skill applied in 
observing, and the larger accordingly will be the controlling circle which you will apply. In 
§ 13 I assumed r = b; in other words the radius of the controlling circle equalled the distance
between the two extreme observations. I admit, however, on better reflexion that this size
of radius seems to me to argue somewhat excessive confidence; it would be safer certainly
in future to put $r = \tfrac{3}{2}b$ or even r = 2b. If so, the correction would come out notably smaller
but all the more certain and trustworthy. 

18. If there is any validity in our principles, though they are metaphysical rather than 
mathematical, we may justly conclude therefrom that one should seldom if ever reject an 
observation, and never without the utmost circumspection. I have already given my 
opinion on this subject in § 2. The whole complex of observations is simply a chance event 
modified and confined within certain limits by the skill of the observer. It may well happen, 
though very rarely, that of three observations two are miraculously identical, while the 
third by ill luck is very wide of the other two. But if this happens to me and I am certain 
that I have not unduly contracted the limits of maximum possible error or shown undue 
confidence in my skill, I should not hesitate to refer the examination of the whole case to 
our principles and form my estimate from them. Only the observer must give the same 
attention to each of the observations. I should like them all treated equally. 

19. The only remaining caution refers to the controlling scale which I have applied. 
We have taken a semicircle as answering sufficiently the conditions set out in §7 and at the 
same time most suited to the calculations that have to be carried out. Meanwhile it is 
worthy of note that there are other infinite curves which undoubtedly lead to the same 
equation as I set out in § 13. In § 11 we made the probabilities, for a circular scale, proportional
respectively to the perpendiculars

$$\sqrt{r^2 - x^2}; \quad \sqrt{r^2 - (x-a)^2}; \quad \sqrt{r^2 - (x-b)^2}.$$

Now if instead of a semicircle we suppose a parabola (arcum parabolicum) constructed on
the line 2r, with its axis passing perpendicularly through the middle of this line, then
keeping the same notation, we shall have perpendiculars, or the corresponding probabilities
expressed by them,

$$\frac{p}{r^2}(r^2 - x^2), \quad \frac{p}{r^2}\{r^2 - (a-x)^2\}, \quad \frac{p}{r^2}\{r^2 - (b-x)^2\}, \quad \text{etc.},$$

where the new letter p denotes the longest perpendicular at the abscissa x = 0. Now since 
the factor p/r? is common to all the terms we can simply substitute unity for this factor when 
we have brought the product of all the several probabilities to a maximum. It follows 
from this that the parameter of the parabola is always arbitrary. I also pointed out in the 
aforementioned §11 that if this product has been brought to its maximum all its powers 
will at the same time be maximized or minimized. It is obvious from this that both scales, 
the parabolic and the circular, lead to the same required value of x. Furthermore, it is
evident that innumerable other scales fulfil the same function; they will all have this
property, that from their peak they approach in either direction the line 2r, on which the 
several observations are necessarily supposed to lie, and intersect it. Therefore all scales 
of this sort achieve our aim, and we need not be too pedantic in this matter, since we are 
content to strive for something better if not for the best. 

20. Finally, as regards the awkward, not to say monstrous, form of our fundamental 
equation set out in §13, we can mend the awkwardness somewhat; for I express the useful 
root as approximately 


$$x = \frac{a+b}{3} + \frac{2a^3 - 3a^2b - 3ab^2 + 2b^3}{27r^2}.$$


The first term is none other than the common arithmetical mean for three observations, the 
second indicates approximately the further correction required by our principles. This 
root indeed will agree all the more accurately with the equation of § 13, the greater is assumed 
to be the width of the controlling scale indicated by 2r. Far be it from us, however, to increase 
the value of the letter r unnecessarily merely to make calculation easier, for every useless 
increase takes away a little from the amount of our correction. Nor would it be less dan- 
gerous to attribute too much to one’s powers of observation and so shorten the radius r 
unjustifiably. ‘There are fixed bounds, outside of which justice cannot exist’: cf. § 10. 
Our principles themselves show that it is impossible for r to be less than $\tfrac{1}{2}b$, since this involves
the manifest contradiction of positing as impossible something which is supposed
to have actually happened. I have not concealed, however, the somewhat free assumptions 
that have been made in the course of our argument; but I should not have thought that all 
our methods of judging the observations that have been made ought to be rejected on that 
account. Of this at least I am convinced, that the common rule for three observations gives 
somewhat too small a result when $a < \tfrac{1}{2}b$ and too large a result if $a > \tfrac{1}{2}b$, and cannot ever be
applied with greater certainty than when the intermediate observation is approximately equidistant
from the two extremes. Secondly, I think it probable that our equation in § 13 gives
a safer and better determination of the position to be selected, provided the radius of the 
controlling circle is not rashly diminished beyond the limits which the powers of the 
observer permit: cf. §17. The question that I have dealt with is properly this: given three 
or more shots of an arrow marked on a straight line, to determine the most probable position 
of the point at which the archer was aiming. But any and every observer who understands 
these things will form for himself criteria which will answer his purpose, according to the 
nature of the material (argumento) which he has to hand, provided he makes cautious use 
of the rules derived from the theory of combinations. 
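
The behaviour of the approximate root can be illustrated numerically. The sketch below rests on an assumption: that the correction term is $(2a^3 - 3a^2b - 3ab^2 + 2b^3)/27r^2$, which is what a first-order expansion of the maximum condition in powers of $1/r^2$ gives for three observations 0, a, b. The exact root is found by bisection on the slope of the log-product; both function names are hypothetical.

```python
def approximate_root(a, b, r):
    """Arithmetic mean plus the assumed first-order correction in 1/r^2."""
    mean = (a + b) / 3.0
    correction = (2 * a**3 - 3 * a * a * b - 3 * a * b * b + 2 * b**3) / (27.0 * r * r)
    return mean + correction


def exact_root(a, b, r, tol=1e-12):
    """Bisection on d/dx of sum(log(r^2 - (x - o)^2)) for o in (0, a, b)."""
    def slope(x):
        return sum(-2.0 * (x - o) / (r * r - (x - o) ** 2) for o in (0.0, a, b))

    lo, hi = max(0.0, a, b) - r, min(0.0, a, b) + r
    eps = 1e-9 * (hi - lo)          # step inside so the slope stays finite
    lo, hi = lo + eps, hi - eps
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope(mid) > 0.0:        # still climbing: the maximum lies to the right
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Example 1 (a = 0.2, b = 1) with a widening controlling scale.
gaps = [abs(approximate_root(0.2, 1.0, r) - exact_root(0.2, 1.0, r))
        for r in (1.0, 2.0, 4.0)]
print(gaps)   # the gap should shrink as r grows
```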


Recapitulation. By its very nature our problem is indeterminate, inasmuch as it depends 
on the practice, experience, and skill of the observer, on the precision of the instruments, 
on the keenness of the senses, in short on countless circumstances which may be more or 
less favourable. Account will be taken of all these things in assuming the width of the field 
of possible deviations; on this subject I have given my opinion, with all circumspection. 
Secondly, one has to examine the casual working of chance in favour of any given deviation 
(lit. the working of the casual chance which favours any deviation), since it is advantageous 
if any given deviation is assigned the probability which from the nature of the case fits it.†
To be sure, this scale of probabilities remains in its turn uncertain and undetermined, should
an accurate one be desired, but displays, nevertheless, by the very nature of the case, several
properties; and if these are satisfied, it may be considered to be sufficiently known, as I
learned from several experiments. So a method comes to light of expressing in accordance
with the proven precepts of the theory of probability the absolute probability appropriate
to any given system of observations for any assumed location of that system. It only remains
then to select that location of the system in question which enjoys the highest probability.
It certainly seems extraordinary to me that the algebraic equation defining this location,
which is so far-fetched and rises to the fifth degree for only three observations, which is
expressed in a very large number of terms and is deduced from principles never used before,
nevertheless, from whatever point it is examined, gives rise to nothing which is in the least
displeasing, still less leads to any absurd result. The upshot of the calculations in any example
is little different from that which is indicated by the common method, provided one does
not recklessly jump at the precepts which I have laid down. Where the comparison of three
given observations shows that the middle one is approximately equidistant from the
extremes, we shall adhere without hesitation to the common rule; but if the two intervals
are notably unequal, I think it is better to have recourse to our theory, provided one follows
the precepts I have set out and exercises the greatest prudence in fixing just bounds to the
field of possible deviations. All this I should wish to have weighed in the balance of metaphysics
rather than mathematics. Those who are most shocked by our principles will have
nothing further to contradict if only they make the field of possible deviations as large as
possible.

† Reading cuiuis aberrationi for the cuiuis aberratione of the text.


Observations on the foregoing dissertation of Bernoulli 
By L. EULER 


1. The question which our distinguished friend Bernoulli handles here is one of no little 
moment, namely, how an unknown quantity should be derived from several observations 
which vary slightly from each other. To make the nature of the question easier to discern 
clearly, let us suppose that the elevation of the pole star at some place or other has to be 
discovered and that the observations made to this end have the following different values: 

$$\Pi + a, \quad \Pi + b, \quad \Pi + c, \quad \Pi + d, \quad \text{etc.},$$
where the letters a, b, c, d, etc., are taken to be expressed in seconds. From these the true
elevation of the pole at this place, $\Pi + x$, is to be deduced. Generally this quantity x is
obtained by taking the arithmetic mean of all the quantities a, b, c, d, etc. Hence if the
number of observations = n, x = (a + b + c + d + etc.)/n.

2. In this rule it is obviously assumed that all observations are of the same degree of 
goodness. For if some were more exact than others, account ought to be taken of this 
distinction in the computation. Now although there is no apparent reason in the circum- 
stances why one of these observations should be accorded a greater value than the rest, 
nevertheless, the learned author observes that these observations ought to be awarded a 
higher degree of goodness the nearer they approach to the truth, just as that class of ob- 
servations which is thought to depart too far from the truth is usually completely rejected. 
The whole business therefore amounts to this: to show how the degree of goodness appro- 
priate to the several observations is to be estimated. 







3. According to the view of the distinguished author, it will be convenient to consider 
the deviation of each observation from the truth as already known. This will be x − a for the
first observation, x − b for the second, x − c for the third, etc., but the defect of each observation
should be estimated not so much from these differences as from their squares, since
the defect itself is to be reckoned as the same whether the observation errs by excess or
defect. Hence if some observation agrees perfectly with the truth, its defect will be zero.
If therefore the degree of goodness of this observation is indicated by $r^2$, it is obvious that
the degree of goodness of the first observation must be indicated by $r^2 - (x-a)^2$, that of
the second by $r^2 - (x-b)^2$, of the third by $r^2 - (x-c)^2$ and so on, the value of r being such
that for an observation which is to be all but rejected the degree of goodness vanishes. If
we assume that this happens in the observation which gives $\Pi + u$, then since the degree of
goodness of this would be $r^2 - (x-u)^2$, it must be laid down in all cases that $r^2 = (x-u)^2$.

4. Having established these conclusions concerning the degree of goodness of each 
observation, the distinguished author appeals to the following principle, for which indeed 
he gives no reason: that the product of all the formulae expressing the degrees of goodness 
of the several observations should be allotted a maximum value. On this principle therefore 
he bids one differentiate this product and equate the differential with nought, since this 
equation will then give the true value of x. This he illustrates with some examples based 
on sets of three observations, deriving therefrom values of x which seem to be quite in 
conformity with the truth. 

5. This principle for only three observations led to an equation of the fifth degree, whose 
root x had to be found; and anyone who wished to apply the principle to four observations 
would arrive at an equation of the seventh degree. Five observations would lead to one of 
the ninth degree and so on. It is thus abundantly evident that this method cannot possibly 
be used where there are several observations, and this is in fact candidly conceded by the 
distinguished author, who presents the whole dissertation as a purely metaphysical 
speculation. 

6. As, however, the distinguished author has not supported this principle of the maximum
by any proof, he will not take it amiss if I propound certain doubts about it. If we
assume that among the observations in question there is one that should be almost rejected, 
whose degree of goodness would accordingly be as small as possible, it is evident that the 
product of all the formulae mentioned would in fact be reduced to nothing, so that it could 
not possibly be considered as a maximum, no matter how great it might be, were that 
observation omitted. Now the principles of the theory of probability make it abundantly 
clear that the value of the unknown quantity x should come out the same whether an
observation such as this, which has no goodness at all, is introduced into the calculation or 
totally rejected. 

7. I do not think that it is necessary in this question to have recourse to the principle of
the maximum, since the undoubted precepts of the theory of probability are quite sufficient
to resolve all questions of this kind. If the first observation, which gave $\Pi + a$, is assigned
the amount or degree of goodness (pretium seu gradum bonitatis) $\alpha$, the second $\beta$, the third $\gamma$, etc.,
the unknown quantity x is given by the rules of this theory thus:

$$x = \frac{\alpha a + \beta b + \gamma c + \delta d + \text{etc.}}{\alpha + \beta + \gamma + \delta + \text{etc.}}$$

Hence
$$\alpha(x-a) + \beta(x-b) + \gamma(x-c) + \delta(x-d) + \text{etc.} = 0.$$













Now it is clear that if all the grades of goodness were equal and the number of observations 
were n, we should have x = (a+b+c+d+etc.)/n, as required by the common rule. From 
which it follows that different values may emerge for the unknown quantity x to the extent
that the degrees of goodness differ.

8. Since therefore, as the distinguished author himself states, the grades of goodness
indicated by the letters $\alpha, \beta, \gamma, \delta$ are

$$
\begin{aligned}
\alpha &= r^2 - (x-a)^2, & \beta &= r^2 - (x-b)^2, \\
\gamma &= r^2 - (x-c)^2, & \delta &= r^2 - (x-d)^2, \quad \text{etc.},
\end{aligned}
$$

the equation we have found becomes

$$
\begin{aligned}
& r^2(x-a) + r^2(x-b) + r^2(x-c) + \text{etc.} \\
& - (x-a)^3 - (x-b)^3 - (x-c)^3 - \text{etc.} = 0.
\end{aligned}
$$

Hence if the number of observations = n and we put for brevity's sake

$$
\begin{aligned}
a + b + c + d + \text{etc.} &= A, \\
a^2 + b^2 + c^2 + d^2 + \text{etc.} &= B, \\
a^3 + b^3 + c^3 + d^3 + \text{etc.} &= C,
\end{aligned}
$$

that equation is reduced to the following fairly simple form

$$nr^2x - Ar^2 - nx^3 + 3Ax^2 - 3Bx + C = 0.$$

Thus we arrive at a cubic equation, from which the unknown x can easily be found, whatever
the number of observations n. 
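
As an illustration of how the weighted mean of § 7 and the cubic of § 8 fit together, the sketch below iterates the weighted mean with weights $r^2 - (x-a)^2$ and then checks that the result satisfies the cubic. The data are the deviations of the six meridian observations quoted in § 14; the choice r = 30 and the function names are assumptions made only for the example.

```python
def euler_weighted_mean(observations, r, iterations=50):
    """Iterate x = sum(w * a) / sum(w) with weights w = r^2 - (x - a)^2."""
    x = sum(observations) / len(observations)      # start from the common rule
    for _ in range(iterations):
        weights = [r * r - (x - a) ** 2 for a in observations]
        x = sum(w * a for w, a in zip(weights, observations)) / sum(weights)
    return x


def cubic_residual(observations, r, x):
    """Left-hand side n r^2 x - A r^2 - n x^3 + 3 A x^2 - 3 B x + C of the cubic."""
    n = len(observations)
    A = sum(observations)
    B = sum(a * a for a in observations)
    C = sum(a ** 3 for a in observations)
    return n * r * r * x - A * r * r - n * x ** 3 + 3 * A * x * x - 3 * B * x + C


meridian = [1.5, 3.5, -9.5, 1.5, 1.5, 1.5]   # seconds, measured from the mean
x_star = euler_weighted_mean(meridian, r=30.0)
print(x_star, cubic_residual(meridian, 30.0, x_star))
```

The iteration converges quickly here because the weights change very little with x when r is large compared with the deviations; for a much smaller r this simple scheme is not guaranteed to converge.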

9. If we regard the quantity r as infinite, which is the case when all the observations are 
assigned the same degree of goodness, then we may neglect all the other terms and directly 
deduce from this equation the following 


$$x = \frac{A}{n} = \frac{a + b + c + d + \text{etc.}}{n},$$


just as is required by the rule which is commonly adopted. If we designate this value by the 
letter p, and substitute $\Pi$ for $\Pi + p$ in the observations themselves, we shall have to diminish
the several numbers a, b, c, d, etc., by the same quantity p, and thus the sum of them all,
for which we put A, will equal 0. To avoid, however, the introduction of new letters into the
calculation at this point we can from the beginning so constitute the quantity $\Pi$ that if
the values of the several observations are given as $\Pi + a$, $\Pi + b$, $\Pi + c$, $\Pi + d$, etc., the sum
of the letters a + b + c + d + etc. = 0. Then to discover the quantity x we shall have the
following much simpler equation

$$nx^3 - nr^2x + 3Bx - C = 0,$$

from which would follow, if r were infinite, x = 0. It is clear from this that if this equation
has several real roots, the smallest should be taken as x, so that the required true value
will be $\Pi + x$.

10. This same question can, however, be referred even to a quadratic equation by introducing
the sort of observation which after weighing all the circumstances we decide should
be totally rejected. Let such an observation be $\Pi + u$, and since ex hypothesi its degree of
goodness $r^2 - (x-u)^2 = 0$, $r^2 = (x-u)^2$. The introduction of this value in the last equation
that we found produces the following form

$$2nux^2 - nu^2x + 3Bx - C = 0.$$

It will be convenient to regard the term $-nu^2x$ in this equation as the greatest, so that the
equation can be expressed as follows

$$x(nu^2 - 3B - 2nux) = -C,$$

from which follows

$$x = \frac{-C}{nu^2 - 3B - 2nux},$$

where by substituting for x the value just obtained we get the following continued fraction

$$x = \cfrac{-C}{nu^2 - 3B + \cfrac{2nuC}{nu^2 - 3B + \cfrac{2nuC}{nu^2 - 3B + \cdots}}},$$

a form which will soon give the true value of x itself.
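
The continued fraction amounts to repeating the substitution $x \leftarrow -C/(nu^2 - 3B - 2nux)$. A minimal sketch, using the values n = 6, B = 111½, C = −801 and u = 30 that appear in the worked example of §§ 14 and 15, compares the iterated value with the root of the quadratic; the function names are illustrative only.

```python
import math


def continued_fraction_root(n, B, C, u, depth=30):
    """Repeat x = -C / (n u^2 - 3B - 2 n u x), starting from x = 0."""
    x = 0.0
    for _ in range(depth):
        x = -C / (n * u * u - 3.0 * B - 2.0 * n * u * x)
    return x


def quadratic_roots(n, B, C, u):
    """Both roots of 2 n u x^2 - (n u^2 - 3B) x - C = 0 (real roots assumed)."""
    qa, qb, qc = 2.0 * n * u, -(n * u * u - 3.0 * B), -C
    disc = math.sqrt(qb * qb - 4.0 * qa * qc)
    return ((-qb - disc) / (2.0 * qa), (-qb + disc) / (2.0 * qa))


x_cf = continued_fraction_root(6, 111.5, -801.0, 30.0)
x_small = min(quadratic_roots(6, 111.5, -801.0, 30.0), key=abs)
print(x_cf, x_small)   # the two values should agree
```

The substitution converges to the root of smaller magnitude because $nu^2 - 3B$ dominates the other terms, which is the condition the text asks for.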

11. Since the distinguished author has founded his solution on the principle of the
maximum, it will not now be difficult to produce an analytical formula of this sort which,
when made equal to its maximum, yields the true value of x. Let us use for this purpose
the form first discovered

$$
\begin{aligned}
& r^2(x-a) + r^2(x-b) + r^2(x-c) + \text{etc.} \\
& - (x-a)^3 - (x-b)^3 - (x-c)^3 - \text{etc.} = 0,
\end{aligned}
$$

which may be regarded as the differential of some formula which is to be raised to its
maximum. This formula itself will emerge, if this expression is put in the form of a differential
and integrated. Multiplying by 4dx and integrating we obtain

$$
\begin{aligned}
& 2r^2(x-a)^2 + 2r^2(x-b)^2 + 2r^2(x-c)^2 + \text{etc.} \\
& - (x-a)^4 - (x-b)^4 - (x-c)^4 - \text{etc.} + \text{constant}.
\end{aligned}
$$

If we assume $-nr^4$ as the constant, there being n observations, by change of sign the
following formula results

$$\{r^2 - (x-a)^2\}^2 + \{r^2 - (x-b)^2\}^2 + \{r^2 - (x-c)^2\}^2 + \text{etc.}$$

12. In place therefore of the formula which our distinguished friend Bernoulli thought 
should be made equal to its maximum we have now arrived at another formula very well 
suited to the nature of the question, which when brought to its maximum gives the true 
value of x, since this formula is obtained by adding together the squares of all the degrees 
of goodness. 
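
The relation between the sum of squared degrees of goodness and the cubic form can be checked by numerical differentiation. In the sketch below the derivative of the sum of squares is compared with −4 times the expression $\sum\{r^2(x-a) - (x-a)^3\}$, the minus sign reflecting the change of sign mentioned in § 11. The data, the value r = 30 and the point x = 0.4 are arbitrary choices for illustration.

```python
def euler_objective(x, observations, r):
    """Sum of squared degrees of goodness, sum((r^2 - (x - a)^2)^2)."""
    return sum((r * r - (x - a) ** 2) ** 2 for a in observations)


def cubic_form(x, observations, r):
    """sum(r^2 (x - a) - (x - a)^3), the expression that is integrated."""
    return sum(r * r * (x - a) - (x - a) ** 3 for a in observations)


def central_difference(f, x, h=1e-5):
    """Symmetric finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


obs = [1.5, 3.5, -9.5, 1.5, 1.5, 1.5]
r, x0 = 30.0, 0.4
slope = central_difference(lambda x: euler_objective(x, obs, r), x0)
mismatch = slope + 4.0 * cubic_form(x0, obs, r)
print(slope, mismatch)   # mismatch should be negligible next to slope
```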

13. To furnish an example of our method, let us consider the observations by which the 
longitude of the observatory of St Petersburg is deduced from the difference between the 
meridians of the observatories of Paris and St Petersburg. These are reported as follows: 


I    1°51′50″        IV   1°51′50″
II   1°51′52″        V    1°51′50″
III  1°51′39″        VI   1°51′50″

Taking the arithmetic mean of these in the usual way we obtain 1°51′48½″.

14. Now let us apply our formulae to this case, taking $\Pi$ = 1°51′48½″. The values of
our six letters a, b, c, d, e, f will be

$$a = 1\tfrac12, \quad b = 3\tfrac12, \quad c = -9\tfrac12, \quad d = 1\tfrac12, \quad e = 1\tfrac12, \quad f = 1\tfrac12.$$

Their sum A = 0; the sum of the squares B is found to be $\tfrac12(223)$; the sum of the cubes
C = −801. Hence our equation for n = 6 will be

$$12ux^2 - 6u^2x + 334\tfrac12 x + 801 = 0.$$

15. Now let us define the number u from a case which the author of the observations
thinks should be rejected, such as 1°52′20″, which gives $u = 31\tfrac12$. Let us suppose that
u = 30, making our quadratic equation

$$360x^2 - 5065\tfrac12 x + 801 = 0,$$

instead of which we may write in round figures

$$36x^2 = 500x - 80.$$

From this

$$x = \frac{250 \pm \sqrt{59{,}620}}{36},$$

that is either

$$x = \frac{250 + 244}{36} = 13\tfrac34 \quad \text{or} \quad x = \frac{250 - 244}{36} = \frac16.$$

The latter value only can be considered, and might have been obtained immediately by
neglecting the first term in the equation: the value of x would then have been $\tfrac{4}{25}$ or approximately
$\tfrac16$. The required difference of the meridians will therefore be 1°51′48⅔″.
16. Again, suppose that observation had been rejected which gave 1°51′0″: we then
have $u = -48\tfrac12$. Let us take u = −48, giving the equation

$$-576x^2 - 13{,}489\tfrac12 x + 801 = 0.$$

Neglecting the first term we obtain $x = 801/13{,}489\tfrac12 \approx \tfrac{1}{17}$. Now since this observation would have
deserved to be rejected, if u had been in the neighbourhood of −300, hence, carrying out
the calculation as before, x would have come out as about 4. It is clear from this that in 
this case we could have been content with the common rule, since not even a second’s 
difference is involved. 

17. Since, however, the third of these observations differs so much from the others, 
it will perhaps be convenient to set the limit not far from it. If we were to do this for the 
case 1°51′33½″, u = −15″, our equation would accordingly be

$$-180x^2 - 1000x + 800 = 0,$$

the smaller root of which equation will be $x = \tfrac34$. Hence the difference of the meridians
would have come out as 1°51′49¼″. It is once more clear from this case that no notable
error is to be feared, unless we make a quite monstrous mistake in assuming a value for u.
In this matter it will suffice to note that $nu^2$ must always be much larger than 3B.

18. In particular this method deserves to be applied to those observations from which 
the learned Lexell not long ago determined the parallax of the sun. From these we take, 
purely by way of example, the following four conclusions drawn from the observations, 
namely (I) 8.52; (II) 8.43; (III) 8.86; (IV) 8.28. Taking the arithmetic mean of these we get
8.52. If therefore we put $\Pi = 8.52$ the values of the four letters a, b, c, d can be fixed as
follows:

$$a = 1, \quad b = 9, \quad c = -34, \quad d = +24,$$

so that the sum comes out as A = 0.† All these numbers of course denote hundredths of a
second. The sum of the squares B = 1814, the sum of the cubes C = −24,750.


† According to his original usage all these signs should be reversed. Presumably a has been rounded
up to unity to make the sum of deviations zero. 




19. If we now assume as the term where the degree of goodness vanishes u = 40, our 
equation emerges as 
$$320x^2 - 958x + 24{,}750 = 0.$$


From this the value of x itself comes out as imaginary. Let us accordingly assume u = 50; 
the equation will then become 


$$400x^2 - 10{,}000x + 5442x + 24{,}750 = 0$$


and we still arrive at an imaginary result. If, however, we take u = 60 the smaller value 
of x will be 3,3;, which might seem to be too large. If we admit it the parallax of the sun 
would be 8.555. But let us note that larger values of u give smaller values for x. Since the
application of this method is so vague, we may well doubt whether in this fashion we can 
arrive any closer to the truth, and perhaps it will suffice to have learnt at any rate, whether 
the value of x will come out positive or negative. 
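
The dependence of the result on the assumed limit u can be explored with the quadratic formula. The sketch below uses the values n = 4, B = 1814 and C = −24,750 of § 18 and reports no root when the discriminant is negative, which is the "imaginary" case of the text. The trial value u = 100 and the function name are additions made only for illustration.

```python
import math


def smaller_root(n, B, C, u):
    """Smaller-magnitude real root of 2nu x^2 - (n u^2 - 3B) x - C = 0, or None."""
    qa, qb, qc = 2.0 * n * u, -(n * u * u - 3.0 * B), -C
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0.0:
        return None                      # no real root for this u
    root = math.sqrt(disc)
    return min(((-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa)), key=abs)


lexell = dict(n=4, B=1814.0, C=-24750.0)     # hundredths of a second
results = {u: smaller_root(u=u, **lexell) for u in (40.0, 50.0, 60.0, 100.0)}
print(results)
```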

20. In this case, to be sure, we have seen that the value of x is certainly positive, since 
we have found a negative number for C. Hence we may profitably observe in general that 
whenever C comes out positive, x becomes negative, while if C is negative the value of x
will be positive. In either case it must of necessity be so small that the result will hardly
differ from the common rule. This at any rate can be added, that the larger the number C,
the greater must necessarily be the value of x. For if the sum of the cubes C actually vanished,
then x would always = 0, whatever value is accepted for u, just as the common rule
requires. 

21. Thus, notwithstanding the uncertainty produced by the number u, it seems that
something reasonably probable can be laid down even if we cannot reach certainty, if we 
pay attention to the following points. First, it is certain that whenever the sum of the cubes 
C = 0, x will always = 0. Secondly, the larger the quantity C, the larger will be the value
of x itself, with the opposite sign. Thirdly, it is clear enough that the quantity $nu^2$ must be
very much greater than the quantity 3B. In view of this we can lay it down with reasonable
probability that $x = -C/(\lambda nB)$, where the number $\lambda$, it is true, is left to our judgement.
However, it will meet all cases and depart hardly at all from the truth, if we put $\lambda = 2$ or
at most $\lambda = 3$. The resulting difference will usually be so unimportant that we hardly need
consider it. For the case where the greatest error is to be feared would undoubtedly be that
in which several observations, i in number, agree entirely in each giving the value a, while
the one remaining observation gives −ia, so that the sum of them all A = 0. The sum of
the squares $B = ia^2 + i^2a^2 = i(i+1)a^2$; the sum of the cubes $C = ia^3 - i^3a^3 = -i(i^2-1)a^3$.
For n = i + 1 our formula gives

$$x = \frac{i(i^2-1)a^3}{\lambda\, i(i+1)^2 a^2} = \frac{(i-1)a}{\lambda(i+1)}.$$

If therefore i is very large and we take $\lambda = 2$ the result is $x = \tfrac12 a$. In the earlier example
where n = 6, $B = 111\tfrac12$ and C = −801, $x = +801/(12 \times 111\tfrac12) = \tfrac35$ approximately. In the
second, where n = 4, B = 1814, C = −24,750, $x = 24{,}750/(8 \times 1814) = 1.7$ approximately.
These values do not appear to involve anything absurd. 

If, however, anyone thinks that it would be more reasonable to take A = 3, I hardly 
think the difference is worth arguing about, since the very nature of the observations does 
not admit of a greater degrce of precision. 
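
The rule of section 21 can be tried numerically. The sketch below is only an illustration: it assumes the observations are given as deviations from their arithmetic mean, and the names `euler_correction`, `worst_case` and `lam` (the multiplier, 2 or at most 3 in the text) are ours, not the author's.

```python
def euler_correction(deviations, lam=2.0):
    """Correction x = -C / (lam * n * B) for deviations from the arithmetic mean.

    B is the sum of squares and C the sum of cubes of the deviations;
    lam is the multiplier left to judgement in the text (assumed 2 by default).
    """
    n = len(deviations)
    B = sum(d ** 2 for d in deviations)
    C = sum(d ** 3 for d in deviations)
    return -C / (lam * n * B)


def worst_case(i, a=1.0, lam=2.0):
    """The extreme case of the text: i observations at a and one at -i*a."""
    return euler_correction([a] * i + [-i * a], lam)


if __name__ == "__main__":
    for i in (3, 9, 99):
        # closed form from the text: (i - 1) a / (lam (i + 1))
        print(i, worst_case(i), (i - 1) / (2.0 * (i + 1)))
```

As i grows the value approaches a/2 for a multiplier of 2, which is the limiting case mentioned in the text.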







Biometrika (1961), 48, 1 and 2, p. 19 
Printed in Great Britain 


The variance of Spearman’s rho in normal samples 


By F. N. DAVID† and C. L. MALLOWS‡ 
University College London 


Kendall (1949) and David, Kendall & Stuart (1951), modified by Fieller, Hartley & 
Pearson (1957), first gave an approximation to the variance of Spearman's rho, r_s, in samples 
of n from a normal bivariate population in which the product moment correlation is ρ. This 
modified value is 

\[
\sigma^2_{r_s} = \frac{1}{n}\left[1 - 1.563465\rho^2 + 0.304743\rho^4 + 0.155286\rho^6
 + 0.061552\rho^8 + 0.022099\rho^{10} + 0.019785\rho^{12}\right].
\]

Recently we have had occasion to require a closer approximation to this variance and have 
obtained one which is exact for n and to order ρ¹² for ρ. The leading terms of our expression 
agree with those of Kendall's. Many of the actual quantities calculated to achieve our final 
result will be useful in other aspects of the rank problem. Accordingly we give these auxiliary 
quantities in Appendices so that they may be used for other purposes. 

Assume n pairs of observations {x_i, y_i} randomly and independently drawn from a 
bivariate normal population in which the product moment correlation is ρ. Let X_i be the 
rank of x_i and Y_i the rank of y_i. Using the technique also applied by Moran (1948) for finding 
ℰ(r_s), we define a function H(t) such that 

\[
H(t) = 0 \quad (t < 0), \qquad H(t) = 1 \quad (t > 0).
\]

Accordingly we may write 

\[
X_i - 1 = \sum_{j=1}^{n} H(x_i - x_j), \qquad Y_i - 1 = \sum_{m=1}^{n} H(y_i - y_m),
\]
with
\[
s = \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{m=1}^{n} H(x_i - x_j)\,H(y_i - y_m), \qquad
r_s = \frac{12}{n^2-1}\left[\frac{s}{n} - \frac{(n-1)^2}{4}\right].
\]
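
The relation between the triple sum s and Spearman's rho can be checked on a small sample. The following is a minimal sketch, assuming no ties and taking H(0) = 0; the function names are illustrative only.

```python
from itertools import product


def H(t):
    # Step function of the text; H(0) is taken as 0, which assumes no ties.
    return 1 if t > 0 else 0


def triple_sum(x, y):
    """s = sum over i, j, m of H(x_i - x_j) H(y_i - y_m)."""
    n = len(x)
    return sum(H(x[i] - x[j]) * H(y[i] - y[m])
               for i, j, m in product(range(n), repeat=3))


def spearman_from_s(x, y):
    """r_s = 12 / (n^2 - 1) * (s / n - (n - 1)^2 / 4)."""
    n = len(x)
    s = triple_sum(x, y)
    return 12.0 / (n * n - 1) * (s / n - (n - 1) ** 2 / 4.0)


def spearman_from_ranks(x, y):
    """Usual form 1 - 6 sum d^2 / (n (n^2 - 1)), with ranks built from H."""
    n = len(x)
    rank = lambda v: [1 + sum(H(vi - vj) for vj in v) for vi in v]
    X, Y = rank(x), rank(y)
    d2 = sum((a - b) ** 2 for a, b in zip(X, Y))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


if __name__ == "__main__":
    x = [0.3, -1.2, 2.5, 0.9, -0.4]
    y = [1.1, 0.2, 0.7, 2.0, -3.0]
    print(triple_sum(x, y), spearman_from_s(x, y), spearman_from_ranks(x, y))
```

The triple sum costs O(n³) operations, so this form is useful for checking the algebra rather than for computation.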


Since ℰ(s) is known the variance of r_s depends on the evaluation of ℰ(s²). This involves 
expanding the square of the triple sum, s, taking expectations term by term and collecting 
together like terms. A list of the different terms involved and the numbers of each can 
easily be compiled using David & Kendall's symmetric function tables (1949) or the new 
bivariate symmetric function tables recently compiled by Barton, David & Fix. Each term 
is the expectation of a product of four H's. We require the probability that each of these 
products is positive, given that the {x_i, y_i} are randomly and independently drawn, which 
means simply the evaluation of the positive quadrant of the quadrivariate normal integral 
in each case. This is not as formidable as it sounds since many of the products have the same 
variance-covariance matrix and in the event only 23 evaluations were required. 

† With the partial support of the Office of Naval Research while at the University of California, 
Berkeley. 
‡ Now with the Bell Telephone Company, Murray Hill, New Jersey. 
















Moran (1948) and Kendall (1941) have both given series expansions whereby the positive 
quadrant of the quadrivariate normal integral may be evaluated. These series are, however, 
very slow to converge when ρ is at all large, and this makes them unsatisfactory for present 
purposes. Here we make use of a procedure put forward by Plackett (1954). Let the 
correlation matrix be 

\[
R = \begin{pmatrix} 1 & \rho_{12} & \rho_{13} & \rho_{14} \\ & 1 & \rho_{23} & \rho_{24} \\ & & 1 & \rho_{34} \\ & & & 1 \end{pmatrix}
\]
and its inverse
\[
C = \begin{pmatrix} c_{11} & c_{12} & c_{13} & c_{14} \\ & c_{22} & c_{23} & c_{24} \\ & & c_{33} & c_{34} \\ & & & c_{44} \end{pmatrix}.
\]
Write
\[
\chi_{ij} = \frac{-c_{pq}}{\sqrt{(c_{pp}\,c_{qq})}}
\]
to form the matrix
\[
X = \begin{pmatrix} 1 & \chi_{12} & \chi_{13} & \chi_{14} \\ & 1 & \chi_{23} & \chi_{24} \\ & & 1 & \chi_{34} \\ & & & 1 \end{pmatrix}.
\]

Now each of the elements of R will be, for this particular problem, either a constant or ρ with 
a constant multiplier. Let K be the matrix R when ρ is put equal to zero, and L be the matrix 
when ρ is put equal to unity. Then, writing Φ for the integral over the positive quadrant, 

\[
\Phi(R) = \Phi(K) + \frac{1}{4\pi^2}\sum \int_0^{\rho} \frac{L_{ij}}{\sqrt{(1 - L_{ij}^2\rho^2)}}\,\cos^{-1}(-\chi_{ij})\,d\rho,
\]

where p, q are the elements of the permutation 1, 2, 3, 4 which are not i, j and the L_ij are those 
elements of L which are different from those of K. This procedure follows directly from 
Plackett's equations (6) or (7). It enables an expansion to be made about points other than 
ρ_ij = 0, and in most cases gives a quickly convergent series. 


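Example. Sundrum (1953) has given the distribution of Spearman's rho for n = 3 by an 
argument which he does not set out. His results can be shown to follow easily from an 
application of the method of the previous section. Assume three sets of bivariate normal 
variables (x_i, y_i), i = 1, 2, 3, with correlation ρ. Let (x_1 y_2, x_2 y_3, x_3 y_1) correspond to a ranking 
(12, 23, 31) so that the sample Spearman's rho, r_s, is -½. Then 

\[
P\{r_s = -\tfrac12\} = P\{x_1 - x_2 > 0,\; x_2 - x_3 > 0,\; y_2 - y_3 > 0,\; y_3 - y_1 > 0\}.
\]
Writing
\[
X_1 = x_1 - x_2, \quad X_2 = x_2 - x_3, \quad Y_1 = y_2 - y_3, \quad Y_2 = y_3 - y_1,
\]
the variance-covariance matrix of X_1, X_2, Y_1, Y_2 and the correlation matrix are 

\[
\begin{pmatrix} 2 & -1 & -\rho & -\rho \\ & 2 & 2\rho & -\rho \\ & & 2 & -1 \\ & & & 2 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 1 & -\tfrac12 & -\tfrac12\rho & -\tfrac12\rho \\ & 1 & \rho & -\tfrac12\rho \\ & & 1 & -\tfrac12 \\ & & & 1 \end{pmatrix} = R.
\]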




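The inverse is 

\[
C = \frac{1}{3(1-\rho^2)}\begin{pmatrix} 4 & 2 & 2\rho & 4\rho \\ & 4 & -2\rho & 2\rho \\ & & 4 & 2 \\ & & & 4 \end{pmatrix},
\qquad
X = \begin{pmatrix} 1 & -\tfrac12 & -\tfrac12\rho & \tfrac12\rho \\ & 1 & -\rho & -\tfrac12\rho \\ & & 1 & -\tfrac12 \\ & & & 1 \end{pmatrix}.
\]

The K and L matrices are 

\[
K = \begin{pmatrix} 1 & -\tfrac12 & 0 & 0 \\ & 1 & 0 & 0 \\ & & 1 & -\tfrac12 \\ & & & 1 \end{pmatrix}
\quad\text{and}\quad
L = \begin{pmatrix} 1 & -\tfrac12 & -\tfrac12 & -\tfrac12 \\ & 1 & 1 & -\tfrac12 \\ & & 1 & -\tfrac12 \\ & & & 1 \end{pmatrix},
\]
whence Φ(K) = 1/36 and
\[
\begin{aligned}
\Phi(R) &= \frac{1}{36} + \frac{1}{4\pi^2}\left\{ -2\int_0^{\rho} \frac{\tfrac12\cos^{-1}(\tfrac12\rho)}{\sqrt{(1-\tfrac14\rho^2)}}\,d\rho
 + \int_0^{\rho} \frac{\cos^{-1}\rho}{\sqrt{(1-\rho^2)}}\,d\rho
 - \int_0^{\rho} \frac{\tfrac12\cos^{-1}(-\tfrac12\rho)}{\sqrt{(1-\tfrac14\rho^2)}}\,d\rho \right\} \\
&= \frac{1}{36} + \frac{1}{8\pi}\left[-3\sin^{-1}\tfrac12\rho + \sin^{-1}\rho\right]
 + \frac{1}{8\pi^2}\left[(\sin^{-1}\tfrac12\rho)^2 - (\sin^{-1}\rho)^2\right].
\end{aligned}
\]

The probability of the ranking (12, 23, 31), or of any permutation of the 3 pairs, is 6Φ(R), which 
agrees with Sundrum. 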
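
The n = 3 example lends itself to a numerical check. The sketch below reads the reduction as Φ(R) = Φ(K) + (1/4π²) Σ ∫ L_ij (1 - L_ij² t²)^(-1/2) cos⁻¹(-χ_ij) dt over the four elements of R that vary with ρ, and compares a Simpson-rule evaluation with the closed form 1/36 + (1/8π)(sin⁻¹ρ - 3 sin⁻¹½ρ) + (1/8π²)[(sin⁻¹½ρ)² - (sin⁻¹ρ)²]. Both readings are assumptions as far as the printed layout goes, and the function names are illustrative.

```python
import math


def phi_closed(rho):
    """Closed form for the orthant probability in the n = 3 example."""
    s1, s2 = math.asin(rho), math.asin(rho / 2)
    return (1 / 36 + (s1 - 3 * s2) / (8 * math.pi)
            + (s2 ** 2 - s1 ** 2) / (8 * math.pi ** 2))


def phi_plackett(rho, steps=2000):
    """Same probability by integrating the reduction formula from 0 to rho (|rho| < 1)."""
    # (L_ij, chi_ij as a function of t) for elements (1,3), (1,4), (2,3), (2,4)
    terms = [(-0.5, lambda t: -0.5 * t),
             (-0.5, lambda t: 0.5 * t),
             (1.0, lambda t: -t),
             (-0.5, lambda t: -0.5 * t)]

    def integrand(t):
        return sum(L / math.sqrt(1 - (L * t) ** 2) * math.acos(-chi(t))
                   for L, chi in terms)

    # composite Simpson rule; steps must be even
    h = rho / steps
    total = integrand(0.0) + integrand(rho)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return 1 / 36 + total * h / 3 / (4 * math.pi ** 2)


if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.6, 0.9):
        print(rho, 6 * phi_closed(rho), 6 * phi_plackett(rho))
```

At ρ = 0 the ranking has probability 6Φ = 1/6, and at ρ = 1 the closed form vanishes, as it must for a ranking with r_s = -½.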

Essentially the same technique is used to find the probability of each of the products in 
Appendix 1. Each term consists of the product of four normal variables and the probability 
that this product is positive is required. There are fifty-eight of these products but there are 
many common correlation matrices and the work reduces to the evaluation of twenty-three 
such. Of these twenty-three probabilities some can be written down at sight, others may be 
evaluated exactly, while twelve of them involve the evaluation of complicated trigonometric 
functions. They are not elliptic integrals in the strict sense. The correlation matrices and the 
associated probability in the positive quadrant are set out in Appendix 2. It will be noticed 
that eleven integrals have no explicit solution. We have given series expansions for these 
integrals in Appendix 3 and writing 

\[
S_1 = \sin^{-1}\rho, \qquad S_2 = \sin^{-1}\tfrac12\rho
\]

will use the notation A(S_1), A(S_2), etc., to indicate that the series is either in powers of 
sin⁻¹ρ or sin⁻¹½ρ. 








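Collecting the terms in Appendix 1 with the help of Appendices 2 and 3 we have 

\[
\begin{aligned}
\mathcal{E}(s^2) ={}& \frac{n^2(n-1)}{144}\,(9n^3 - 26n^2 + 29n - 8) + \frac{n^2(n-1)^3}{4\pi}\{(n-2)S_2 + S_1\} \\
&+ \frac{1}{4\pi^2}\{n^{(6)}S_2^2 + 2n^{(5)}S_1S_2 + n^{(4)}S_1^2 + n^{(3)}(6S_1^2 - 6S_2^2)\} \\
&+ \frac{n^{(4)}}{\pi^2}\big[(n-4)\{2A(S_2) + C(S_2) + J(S_2) + \tfrac12 K(S_2)\} \\
&\qquad + \{2G(S_1) + D(S_1) - 2H(S_2) + 2I(S_2) - 2E(S_2) + 3F(S_2) + B(S_2)\}\big],
\end{aligned}
\]
where n^{(k)} = n(n-1)...(n-k+1). Remembering that
\[
\mathcal{E}(r_s) = \frac{6}{\pi(n+1)}\,[(n-2)S_2 + S_1]
\]
we have, after reduction, 

\[
\begin{aligned}
\operatorname{var}(r_s) ={}& \frac{1}{n-1} - \frac{36}{\pi^2 n(n-1)(n+1)^2}\big[3(n-2)(3n^2 - 15n + 22)S_2^2 + 12(n-2)^2 S_1S_2 - 2(n-3)S_1^2\big] \\
&+ \frac{144(n-2)(n-3)}{\pi^2 n(n-1)(n+1)^2}\big[(n-4)\{2A(S_2) + C(S_2) + J(S_2) + \tfrac12 K(S_2)\} \\
&\qquad + \{2G(S_1) + D(S_1) - 2H(S_2) + 2I(S_2) - 2E(S_2) + 3F(S_2) + B(S_2)\}\big].
\end{aligned}
\]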


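For purposes of computation this may be written as 

\[
\begin{aligned}
\operatorname{var}(r_s) ={}& \frac{1}{n-1} - \frac{36}{\pi^2 n(n-1)(n+1)^2}\big[3(n-2)(3n^2 - 15n + 22)S_2^2 + 12(n-2)^2 S_1S_2 - 2(n-3)S_1^2\big] \\
&+ \frac{144(n-2)(n-3)}{\pi^2 n(n-1)(n+1)^2}\big[(n-4)\alpha + \beta + \gamma\big], \qquad\text{(X)}
\end{aligned}
\]
where 

\[
\begin{aligned}
\alpha &= 1.8213672S_2^2 + 0.4770655S_2^4 + 0.4393170S_2^6 + 0.4670222S_2^8 + 0.5658339S_2^{10} + 0.7593293S_2^{12}, \\
\beta &= 3.8213672S_2^2 + 2.6820510S_2^4 + 3.3956439S_2^6 + 6.6620844S_2^8 + 10.5227533S_2^{10} + 22.7843043S_2^{12}, \\
\gamma &= 0.7440169S_1^2 - 0.0444207S_1^4 - 0.0021383S_1^6 + 0.0001221S_1^8 + 0.0000407S_1^{10} + 0.0000021S_1^{12}.
\end{aligned}
\]

The coefficients in these series are known exactly and may be calculated to as many decimal 
places as desired. 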
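
Equation (X) is simple to program. The sketch below is an illustration, not the authors' computing scheme: it assumes (X) is read as 1/(n-1) minus the quadratic bracket times 36/(π²n(n-1)(n+1)²) plus [(n-4)α + β + γ] times 144(n-2)(n-3)/(π²n(n-1)(n+1)²), with α and β in powers of S_2 and γ in powers of S_1, truncated at the 12th power.

```python
import math

# Series coefficients of S^2, S^4, ..., S^12 as printed with equation (X).
ALPHA = (1.8213672, 0.4770655, 0.4393170, 0.4670222, 0.5658339, 0.7593293)
BETA = (3.8213672, 2.6820510, 3.3956439, 6.6620844, 10.5227533, 22.7843043)
GAMMA = (0.7440169, -0.0444207, -0.0021383, 0.0001221, 0.0000407, 0.0000021)


def even_series(coeffs, s):
    """Sum of coeffs[k] * s**(2k + 2) for k = 0, ..., 5."""
    return sum(c * s ** (2 * k + 2) for k, c in enumerate(coeffs))


def var_rs(n, rho):
    """Variance of Spearman's rho from the truncated series form (X)."""
    s1, s2 = math.asin(rho), math.asin(rho / 2)
    denom = math.pi ** 2 * n * (n - 1) * (n + 1) ** 2
    quad = (3 * (n - 2) * (3 * n * n - 15 * n + 22) * s2 ** 2
            + 12 * (n - 2) ** 2 * s1 * s2
            - 2 * (n - 3) * s1 ** 2)
    series = ((n - 4) * even_series(ALPHA, s2)
              + even_series(BETA, s2)
              + even_series(GAMMA, s1))
    return (1 / (n - 1) - 36 * quad / denom
            + 144 * (n - 2) * (n - 3) * series / denom)


if __name__ == "__main__":
    for n in (10, 30, 50):
        print(n, [round(var_rs(n, r / 10), 5) for r in range(0, 10)])
```

Because the series are truncated, values for |ρ| near 1 inherit the slow convergence of the β series and should not be trusted.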

When ρ = 0 it is seen that (X) gives the value 1/(n-1) for var(r_s), as indeed it must. When 
|ρ| = 1 the value of var(r_s) is not zero, as it should be, but -0.003320/(n-1), a nonsensical 
value arising from the fact that the convergence of the series β is too slow to permit any 
reasonable approximation. The range over which (X) will be useful will depend on both 
n and ρ as well as on the degree of accuracy required. Arithmetic investigations suggest, as 
a suitable rule for n of reasonable size and for five figure accuracy in var(r_s), that |ρ| should not 








exceed 0.8. Specimen calculations for |ρ| = 0.5 and 0.6 will serve for illustration. Since all 
the terms are exact except for α, β and γ we give these only. 

|ρ| = 0.5: 

α = 0.116289 + 0.001945 + 0.000114 + 0.000008 + 0.000001 + 0.000000 = 0.118357, 

β = 0.243984 + 0.010933 + 0.000884 + 0.000111 + 0.000011 + 0.000001 = 0.255925, 

γ = 0.203977 - 0.003339 - 0.000044 + 0.000001 = 0.200594; 

|ρ| = 0.6: 

α = 0.169091 + 0.004112 + 0.000352 + 0.000035 + 0.000004 + 0.000001 = 0.173594, 

β = 0.354767 + 0.023116 + 0.002717 + 0.000494 + 0.000073 + 0.000015 = 0.381182, 

γ = 0.308093 - 0.007617 - 0.000152 + 0.000004 = 0.300328. 

Thus, writing N = n - 1 for conciseness, we have, for example, when |ρ| = 0.5, the alternative 
forms 

\[
\begin{aligned}
\operatorname{var}(r_s) &= \frac{1}{N} - \frac{36}{\pi^2 N(N+1)(N+2)^2}\big[N^3(0.101198) + N^2(0.303626) + N(0.186232) - 0.042742\big] \\
&= \frac{1}{N} - \frac{1}{N(N+1)(N+2)^2}\big[N^3(0.36913) + N^2(1.10749) + N(0.67934) - 0.15590\big] \\
&= \frac{0.63087}{N} + \frac{0.73816}{N^2} - \frac{1.41710}{N^3} + \frac{2.81264}{N^4} + \cdots. \qquad\text{(Y)}
\end{aligned}
\]

(Y) would seem to be the appropriate formula for the calculation of var(r_s; |ρ| = 0.5) for 
any N = n - 1. 

We have given some thought to the possibility of alternative expansions of the integrals 
A, B, etc., when ρ is large. When |ρ| = 1, S_1 = ½π, S_2 = ⅙π and 

\[
\begin{aligned}
\operatorname{var}(r_s) = 0 ={}& \frac{1}{n-1} - \frac{36}{\pi^2 n(n-1)(n+1)^2}\big[3(n-2)(3n^2 - 15n + 22)\tfrac{1}{36}\pi^2 + (n-2)^2\pi^2 - \tfrac12(n-3)\pi^2\big] \\
&+ \frac{144(n-2)(n-3)}{\pi^2 n(n-1)(n+1)^2}\big[(n-4)\{2A(\tfrac16\pi) + C(\tfrac16\pi) + J(\tfrac16\pi) + \tfrac12 K(\tfrac16\pi)\} \\
&\qquad + \{2G(\tfrac12\pi) + D(\tfrac12\pi) - 2H(\tfrac16\pi) + 2I(\tfrac16\pi) - 2E(\tfrac16\pi) + 3F(\tfrac16\pi) + B(\tfrac16\pi)\}\big].
\end{aligned}
\]
Let
\[
\begin{aligned}
A^*(S_2) &= \int_0^{\frac16\pi} \sin^{-1}\!\left(\frac{\sin 2x}{\sqrt{(1 + 2\cos 2x)}}\right)dx - \int_0^{\sin^{-1}\frac12\rho} \sin^{-1}\!\left(\frac{\sin 2x}{\sqrt{(1 + 2\cos 2x)}}\right)dx \\
&= \int_{\sin^{-1}\frac12\rho}^{\frac16\pi} \sin^{-1}\!\left(\frac{\sin 2x}{\sqrt{(1 + 2\cos 2x)}}\right)dx
\end{aligned}
\]
with similar meanings for B, C, etc. Then 

\[
\begin{aligned}
\operatorname{var}(r_s) ={}& -\frac{144(n-2)(n-3)}{\pi^2 n(n-1)(n+1)^2}\big[(n-4)\{2A^*(S_2) + C^*(S_2) + J^*(S_2) + \tfrac12 K^*(S_2)\} \\
&\qquad + \{2G^*(S_1) + D^*(S_1) - 2H^*(S_2) + 2I^*(S_2) - 2E^*(S_2) + 3F^*(S_2) + B^*(S_2)\}\big] \\
&+ \frac{36}{\pi^2 n(n-1)(n+1)^2}\big[3(n-2)(3n^2 - 15n + 22)(\tfrac{1}{36}\pi^2 - S_2^2) \\
&\qquad + (n-2)^2(\pi^2 - 12S_1S_2) - \tfrac12(n-3)(\pi^2 - 4S_1^2)\big].
\end{aligned}
\]

Some simplification is possible, as indeed it also is in (X), by adding integrals together, but 
this only reduces the number of series to be evaluated by one or two. We expand about the 
upper limit and obtain series for A*, B*, C*, E*, F*, H*, I*, J*, K* in terms of powers of 
(⅙π - sin⁻¹½ρ) and for G*, D* in powers of (½π - sin⁻¹ρ). The arithmetic involved in the 
algebraic expansions of this kind is, however, lengthy and we do not propose to finish this 
part of our investigation until the necessity for the variance of r_s for ρ large becomes apparent. 

One further way of expressing var(r_s) might be mentioned. The quantities A, B, C, etc., 
may be expressed as a power series in ρ instead of in S_1 and S_2. If we do this we have 

\[
\begin{aligned}
\operatorname{var}(r_s) ={}& \frac{1}{n-1} - \frac{36}{\pi^2 n(n-1)(n+1)^2}\big[3(n-2)(3n^2 - 15n + 22)S_2^2 + 12(n-2)^2 S_1S_2 - 2(n-3)S_1^2\big] \\
&+ \frac{144(n-2)(n-3)}{\pi^2 n(n-1)(n+1)^2}\big[n(0.455342\rho^2 + 0.067762\rho^4 + 0.016893\rho^6 + 0.005223\rho^8 \\
&\qquad + 0.001839\rho^{10} + 0.000710\rho^{12}) + (-0.122008\rho^2 + 0.179778\rho^4 + 0.124555\rho^6 \\
&\qquad + 0.087275\rho^8 + 0.059080\rho^{10} + 0.042552\rho^{12})\big].
\end{aligned}
\]
If S_1 and S_2 are also expanded as series in ρ then 
\[
\begin{aligned}
\operatorname{var}(r_s) ={}& \frac{1}{n-1} + \frac{36}{\pi^2 n(n-1)(n+1)^2}\big[n^3(-0.42863279\rho^2 + 0.08354697\rho^4 + 0.04257246\rho^6 \\
&\qquad + 0.01687474\rho^8 + 0.00664071\rho^{10} + 0.00270655\rho^{12}) \\
&+ n^2(0.15513010\rho^2 - 0.57362293\rho^4 - 0.18443407\rho^6 - 0.02271732\rho^8 + 0.00757524\rho^{10} + 0.01329883\rho^{12}) \\
&+ n(0.36837259\rho^2 + 0.44738882\rho^4 - 0.08427574\rho^6 - 0.27929901\rho^8 - 0.19943375\rho^{10} - 0.13861060\rho^{12}) \\
&+ (0.07179677\rho^2 + 0.06467162\rho^4 + 0.21015257\rho^6 + 0.28589798\rho^8 + 0.31704425\rho^{10} + 0.07923733\rho^{12})\big]. \qquad\text{(Z)}
\end{aligned}
\]
The terms in 1/n in this series agree with those already given by David et al. (1951), 
with the exception of the coefficient of ρ¹⁰, which we find to be 0.0242224 against their 
0.0220992. 
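
The comparison with the earlier 1/n approximation is a short piece of arithmetic: for large n the n³ bracket of (Z), multiplied by 36/π², gives the coefficients of ρ², ρ⁴, ... in the 1/n term. A minimal sketch, assuming the n³ coefficients are as read from (Z) above (the name `leading_terms` is ours):

```python
import math

# n^3 coefficients of rho^2, rho^4, ..., rho^12 in (Z), as read from the text.
N3_COEFFS = (-0.42863279, 0.08354697, 0.04257246,
             0.01687474, 0.00664071, 0.00270655)


def leading_terms():
    """Coefficients of rho^2, ..., rho^12 in the 1/n term of var(r_s)."""
    scale = 36 / math.pi ** 2
    return [scale * c for c in N3_COEFFS]


if __name__ == "__main__":
    for power, coeff in zip(range(2, 14, 2), leading_terms()):
        print(f"rho^{power}: {coeff:+.6f}")
```

The first four values reproduce -1.563465, 0.304743, 0.155286 and 0.061552 of the earlier formula, and the fifth gives the revised coefficient of ρ¹⁰ quoted in the text.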

In the course of this work we have investigated many methods which have been proposed 
for the evaluation of the positive quadrant of the quadrivariate normal integral but have 
found none of greater utility than that proposed by Plackett. It is clear that his given 
technique can be similarly modified for the evaluation of the six-dimensional normal 
integral, in this case two integrals being required for the evaluation of each set of 
concordances. We are at present investigating this process with the idea of obtaining the p.d.f. of r_s 
for the case n = 4. Barbara Snow, who checked all the algebra in the present paper, is hoping 
to apply the same method to obtain the third moment of Kendall's 'tau' coefficient, the 
enumeration for which has been done by Sundrum (1953), and the third moment of 
Spearman's 'rho'. This latter will, however, take some time to accomplish although it is not 
difficult in principle. 

We would like to acknowledge help given in checking algebra by Barbara Snow and 
Evelyn Fix. D. E. Barton contributed many ideas for the expansions in terms of 
(⅙π - sin⁻¹½ρ) which we have not pursued to completion. 













REFERENCES 
David, F. N. & Kendall, M. G. (1949). Tables of symmetric functions. Part I. Biometrika, 36, 
431-49. 

David, S. T., Kendall, M. G. & Stuart, A. (1951). Some questions of distribution in the theory of 
rank correlation. Biometrika, 38, 131-40. 

Fieller, E. C., Hartley, H. O. & Pearson, E. S. (1957). Tests for rank correlation coefficients. I. 
Biometrika, 44, 470-81. 

Kendall, M. G. (1941). Proof of relations connected with the tetrachoric series and its generalization. 
Biometrika, 32, 196-8. 

Kendall, M. G. (1949). Rank and product moment correlation. Biometrika, 36, 177-93. 

Moran, P. A. P. (1948). Rank correlation and product moment correlation. Biometrika, 35, 203-6. 

Plackett, R. L. (1954). A reduction formula for normal multivariate integrals. Biometrika, 41, 
351-60. 

Sundrum, R. M. (1953). Moments of the rank correlation coefficient tau in the general case. 
Biometrika, 40, 409-20. 


APPENDIX 1 


[The numbers 1, 2, 3, 4, 5, 6, are used merely to differentiate the x’s and y’s.] 





Corre- No. 
lation of Par- lation of Par- 
matrix terms Type tition matrix terms Type tition 
a nm) (x, — 2X2) (Yr — Ys) (%_— Xs) (Ya—Yo) (18) : _— rs Bin, lb: ae 
Hy — Lg) (Yi — Ys) (Vg — Ve) (Ya - Yr 
b nm) (%— 2) (Y1 — Yo) (%3—%4) (Ys—Ys) (2714) a (&1 — Xq) (Yi — Ys) (%4 — Xs) (Ys — Yi) 
c (%, — %q) (Y1 — Ys) (%1 — V4) (Yi — Ys) o (1 — Xa) (Yi — Ys) (%4 — Xs) (Ya — Yo) 
d (X1 — Xg) (Y1 — Ys) (%2 — %4) (Y2— Ys) 
d (%, —X) (Y1 — Ys) (3 — %4) (Ys — Ys) p n® (2, —29) (Yy — Yo) (%2—23) (Y2—Ya) (313) 
d (2 — 2) (Y1 — Ys) (4 — %) (Ys — Ys) q (ay — 2g) (Y; — Ye) (%3 — 2) (Ys — Ya) 
e€ (% — %q) (Y1 — Ys) (%q — Xe) (Ya— Ys) q (2% — 2) (Y1 — Yo) (%3 — Xq) (Ys — Ya) 
f (@ — %2) (Y1 — Ys) (Va — Vs) (Ys — Ys) Pp (%1 — Xe) (Yy — Ys) (%¥g — 2X1) (Ya — YY) 
d (2, — 2%) (Y1 — Ys) (%4 — 75) (Ya—Y1) q (ary — 2g) (yy — Ys) (14 — 22) (Ys — Yo) 
f (2 —%q) (Yi — Ys) (%p — Xs) (Ya — Yo) q (ay — 23) (y; — Ys) (2%, — 2s) (Ya —Ys) 
e (2 — %2) (Y1 — Ys) (%4— 5) (Ya — Ys) 
, (1 — 2) (Ya — Ys) (a 5) (Ya— Ys) rn (a — 9) (Ys —Y2) (%1—%) (Yr —Ya) (2°) 
s (2, —24) (Ya —Ys) (21-22) (Yi — Ya) 
g (ay — 29) (ys — Ya) (71 — 5) (Ys — Ya) (271?) é in, ajo eh, aie a 
° (1 — 2) (Ya — Ya) (5 — 2a) (Ya— 92) t (2-22) (Y1—Ys) (%3— 21) (Ys— Ya) 
a (% — X2) (Ya — Ys) (Xs — X4) (Ys — Yo) t (x — a2) (Yy — Ys) (22— 2s) (Y2—Y1) 
j es es sels ans 
g Hy — Lg) (Yi — Ys) (% 1 — Vg) (Yr — Yas (3) - nies i a Re 321 
h (275 — 24) (Ya Ya) (2a 24) (Yaa) . * Se 
h (% — 2%) (Yy — Ys) (%3 — %4) (Ys — Ya) ° (x Bs $= flo (23-23) (Y3—Yo) 
g (% — Xq) (Yy — Ys) (% — %4) (Yr — Yo) . regi ) (Ys — Yo) (%3— 22) (¥3— 41) 
k (2 — 2X2) (Yy — Ys) (7% — Xe) (Yr — Ya) ~ man (Y4— Ye) (%3— 2) (Y3— 4) 
g (@ — 2X) (Y1 — Ys) (%1 — Xs) (Yr — Ya) u rai yyy -y. ) (2, — Xe) (Y1 — Ye) 
: Se ee u (2, 24) (yi —a) (4 29) (Ys— Ya) 
: (1 — 22) (Ya — Ys) (a 2a) (Ya Ya) v (2, — 2) (Yi — Ys) (3 — 2) (Ys — Yo) 
U (a — %g) (Yy — Ys) (%4— %1) (Ys — Ys) ° (x, — 2») (Y; — Ys) (2—2s) (Ya—Ys) 
™ (2 — %q) (Yy — Ys) (%q — Xe) (Ys — Ys) ~ in, eee one ea (Ye—Ys) 
. (X1 — Xg) (Yi — Ys) fl bon oe a ree 
Ly—- 2 - «4-2 _ 
; ee ee i rm (ay 225) (Ys Ys) (3-22) (YoYo) (41*) 
h (@ — 2) (Yi — Ya) (Xs — 71) (Ys — Ya) 
n (2% — 2) (Y1 — Ys) (3 — 71) (Ys — Ya) x ni (a — 24) (Yi — Yo) (1 — Ze) (Yi — Ye) (42) 







Correlation matrix 
tp 0 0 




APPENDIX 2 


Area in positive quadrant of quadrivariate normal surface 








1 sin4p\/1_ sin“p 
(; " 2 ) (j " 27 
o, sin x 
ae aes in! ie Ns) Pn ine d: 
sti =f a (cane) “1 
yf pl ale IP sin 2x 
— +— sin-!— +—— sin-! —_——_— |dz 


vd +2cos 22) 


SF ety in 2 
se ate sin? ( a... = 55) de 





12 47 2 27? /(1+2 cos 2x) 

1 os — sin 22 ) 

= 4 gin~t tae ee 

ate mi f - * (ses 5s 
~*te 3sin 2 —sin 32 

+f" iia 4cos 2x ) ax} 

1 in 2 

Sl as 2 -1 ~i 

as (sin-1 p+ 3sin w+alf “s a(S ) ae 


sin $p 2cos 2a—1 
ae in-? { si Bit ted eee 
[ — (sine | os 2a+1) 


p sin~* $p ‘n-t 3+2cos 2x dx 
_ 1+2cos 2a 


1 = win~*p sin x 
— sin! dx 
247 =" +7 3 


0 3 cos 2a 


sin~*$p . 2cos 2x +1 
+ sin-! |sinx dx 
o cos 2a 


Bue 
6*as 3 
Jp. 9 Pee. 34+2cos2z . ) 
a -10 sede -1 d. 
is* on" tail, —_ 14+2cos22" "| . 


sin~* hp 2 27-1 
= e sin-} (sin x ~Sesets) dx 





, 













APPENDIX 2 (cont.) 











Correlation matrix Area in positive quadrant of quadrivariate normal surface 
0 
ps i” : } 11 pot Ker. a 3+ 2cos 2x 
9 47 me 2* on}, resis ” 14 2cos 2x 
4p 
ip -t -p i , : 4 oe 
Bat hcl - - =e -1 
os ( — 367 ae E (Sein $p—sin-!p)+ L sin 
tp ing [208271 \ ay [sina (82 
i gums 3(2 cos 2x + 1) - 0 soled 3 an 
J 0 
“ 3 t in + pices nd PO 
. p atm 7 cos 2x 
+p 
p -$ -tp 1 1 mi ee] (3*) 
+ sin- sin-? +7 sin-! | ——] dx 
r (<u -4 56a sal w 0 3 
4p 2 a 2cos 2a—1 
. 0 weed? op" 3(2 cos 2a + 1) - 
=~" (3+ 2cos 2x) 
a 3+ 2cos 
+f sin (sin ose) as| 
p + 0 1, , , 1 {f we. _, (sine 
== + — (sin- d. 
. kp 0 13+ - -_ p+2sin tp)+75 sin 3 ae 
sin—* 4 a 
4p -| "sin (sine [PO *) az 
0 3 cos 2a 
oo . (3 2cos 2a+1 
+ sin-! sine / ——— } dx 
0 cos 2x 
pP + Te , 
-4— (gin- in-1 —~_S(atn-1 9)* — (ain-1 40)2 
r tp 4 5 gy Sin + sin Bp) + Ff (sin™ p)* — (sin $p)*} 
p 
1 p 
= 4 — gin-1h 
(4) 4°27" 2 
3p -+ 4p ' 
t -p -4 at ge (Bsin“H4p—sin-*p) += {(sin 4p)*— (sin) 
2p 
p FR 
u ( 4 6tan™ 3+7 sin! p 
p -t tp , 
v —tp 4} 1s + & (sin p+ sin 4p) + {(sin-p)*—(sin-* 4p) 
2p! 
> == -# 1 
w -tp -3 36 +<-(sin1p— sin-! bp) +4 {(sin- 1p)? — (sin! $p)?} 
p 
pw 
ad (p) a+3,%n"P 













a sin 22 dx 3 rs 22 _ 71 hg eet ns 23.112 124. 
- ne > TELE = _— — —— x eee 
V(1+ 2cos 2x) v 3 3% 39.5 34.5.7 37.5?.7 35. 5?.7 
sin x a a 2 ll 1999 2°.1009 
-1 iy eae ek aw ale 10 12 
” (sos.) 2tots” +107 *39.5.77 * 3063.7" 
(sing _ 8g 2.67, 2.15,437 24.67.2171 
oe \1 + 200s Qe 2.338 ° 38" * 37.5 31.5.7 311, 52.7 - 
. _, {sine a af 28 22 2 
" ‘( 3 ees Rte a et 
( " 2cos 2a—1 d 2 3.6 2.7 2.57.7 
- comnimrmmuarighainbanssmemant bi —— -_ _ 
sun (8 | 3(cos2x + 1) 2.3. 38 38 37 
22,437,671 ._ 22.43,133 
=” @.s° *~ 
sn-1 (sin 22082243) ap — 5 pe Tye, 2-18 , 22-1849 
Se ee scons 2a+1) 2.8 38 35 37.5 
2°.143,723 ,  29.11,014,679 
oe + 
3,5 3.52.7 
sinx a a 2 2 73 gi2 
in-! dz = ee re 10 — 
ie (=) ” 3(55 22,38 22,38,5  24.34.5.7  24.37.52.7" + 28.38.53,77 ) 
2cos2x-—1 a? 5 49 221 
-1 n,n a eee 
= (sine 3cos 2x ) ee (75 2.39 38.5" 2.38.7 
138,767 917 
am ane ee a 
37.6.7" 38.5" +.) 
2cos2a4+1 a2 et 31 743 
snl | at i oe i Ps oe alk ee 
in (sinz cos 2x ) ae 3(S+agtes +3738.5.7" 
11,017 24,217 
3.5.7". eat) 
ein-! sin 2a aie tenes 7 a4 343 58,111 PP - 
™ \2J(cos 2x)) “ ~ 2° 987 20.3” 7 97.3.5" *98.39.5.7° 210,38, 58” 
ae 3 sin # —sin 3x wees 293 a+ 2495 wi0 1,444,997 at4 
” 4.cos 2x = 927 987 96.3.5° °25.93.7° * 2°.39.52.7 













APPENDIX 3 


Integrals as series expansions 









































































Biometrika (1961), 48, 1 and 2, p. 29 
Printed in Great Britain 


Tests for rank correlation coefficients. II 


By E. C. FIELLER and E. S. PEARSON 
Statistical Advisory Unit, War Office, and University College London 


(This paper must, I fear, be introduced with a short explanatory statement. Before his untimely 
death on 1 December 1960, Edgar Fieller had organized and seen completed all the heavy calculations 
for the distributions of the coefficient r_F and its transform z_F. Publication was delayed because as far 
as he and I were aware there were no theoretical results for the mean and variance of r_F. However, 
since the empirical results establish clearly the utility of z_F from the practical point of view it seems 
now desirable to put these results on record, and to include in the same paper the new results for 
the variance of r_S. For the arrangement of the material and its discussion I must accept responsibility, 
but it should be stated that of the three authors of part I of this series it was Fieller who had from the 
beginning insisted on the inclusion of the r_F and z_F coefficients in the investigation. He was 
undoubtedly right in pressing this point. E.S.P.) 


1. INTRODUCTION 


In the first paper in this series, which will be described as part I (Fieller, Hartley & 
Pearson, 1957), 25,000 sets of correlated random normal deviates were used to examine 
the distributions of Spearman's and Kendall's rank correlation coefficients, r_S and r_K, in 
samples of n = 10, 30 and 50 observations. In particular, it was shown that, as in the case 
of the product moment correlation coefficient r_xy, R. A. Fisher's transformation 

\[
z = \tanh^{-1} r = \tfrac12 \log_e \frac{1+r}{1-r} \tag{1}
\]
had a remarkable normalizing property when applied both to r_S and r_K. In this respect, 
and also as regards power of discrimination, it appeared that there was little to choose between 
these two coefficients. It was also pointed out that the conclusions reached would apply 
not only to rankings generated by sampling from a bivariate normal parent having 
correlation ρ, but to a much wider class of parental distributions which have the property of being 
convertible to the normal by monotonic transformations applied to the marginal 
distributions. 

The present part II contains: (a) A short discussion of the more exact results for the 
variance of Spearman's r_S derived from the work of David & Mallows (1961). (b) Results 
for the Fisher-Yates coefficient r_F, which had been incomplete in 1957. (c) Some comparison 
of the distributions of the three transformed coefficients z_S, z_K and z_F. (d) A numerical 
illustration. 

2. THE VARIANCE OF SPEARMAN'S COEFFICIENT r_S 


The results given in Table 1 have been calculated from equation (X) for var(r_S) given by 
David & Mallows (1961) on p. 22 of the preceding paper. The computation involved was 
carried out by Barbara Snow in the Department of Statistics, University College; as a 
check on arithmetic she also used the David & Mallows alternative formula (Z) involving 
the expansion of S_1 = sin⁻¹ρ and S_2 = sin⁻¹½ρ in powers of ρ. From the computing point 
of view it is of interest to note that the results based on (X) and (Z) agreed to the five 
decimal places tabled, except for the case ρ = 0.9 where there were differences of 2 units 
(n = 10) and 1 unit (n = 30, 50) in the last place. 





Table 1. Theoretical values of var(r_S) 

  ρ      n = 10     n = 30     n = 50 
 0.1    0.10974    0.03399    0.02010 
 0.2    0.10564    0.03250    0.01920 
 0.3    0.09886    0.03006    0.01771 
 0.4    0.08947    0.02672    0.01568 
 0.5    0.07762    0.02258    0.01317 
 0.6    0.06353    0.01777    0.01028 
 0.7    0.04762    0.01254    0.00715 
 0.8    0.03058    0.00727    0.00404 
 0.9    0.01363    0.00260    0.00137 





Since the α, β and γ terms of (X) consist of expansions in powers of S_1 and S_2 the 
curtailment of these series at the 12th power must involve some inaccuracy. No precise estimate of 
this error is possible without further investigation. David & Mallows considered the cases of 
ρ = 0.5 and 0.6 numerically. If we take ρ = 0.9, it is found that it is the β-term which 
converges most slowly. For n = 10, ρ = 0.9 the full contributions to var(r_S) from this term are: 

From terms in:   S_2^2      S_2^4      S_2^6      S_2^8      S_2^10     S_2^12 
                 0.062465   0.009552   0.002635   0.001126   0.000388   0.000183   (all positive) 


Without exploring further, one might hazard a guess that in this case the calculated 
variance may be too small by 2 or 3 units in the 4th decimal place. For n = 30 and 50, the 
error should be no more than 1 or 2 units in the 5th place, even when ρ = 0.9. With this 
caution, it is justifiable to table results to 5 decimal places throughout. 

The figures for var(r_S) in Table 1 may be compared with those given in Table 1 of part I. 
It will then be noted that: 

(a) As expected, the Kendall approximation breaks down for high ρ and low n. 

(b) The results of our experimental sampling conform with the improved theoretical 
values. 

(c) The smoothing of the experimental values, used in the further analysis, was on the 
whole successful in providing estimates of the true values. 

With regard to (c), we had used these smoothed values to calculate approximations to 
the mean and variance of z_S and z_K, obtained from 

\[
\mathcal{E}(z) = \tanh^{-1}\bar r + \frac{\bar r\,\operatorname{var}(r)}{(1-\bar r^2)^2}, \tag{2}
\]
\[
\operatorname{var}(z) = \frac{\operatorname{var}(r)}{(1-\bar r^2)^2}, \tag{3}
\]
r̄ denoting the mean value of r. 

The substitution of the true value of var(r_S) therefore makes no appreciable alteration 
in the figures given in Tables 2 and 3 of part I. It may therefore still be concluded that: 

(i) The two terms of the expansion of equation (2) are adequate to give the expected 
value of z_S to all necessary accuracy, if n ≥ 10, ρ ≤ 0.9.† 


† The changes in var(r_S) from the earlier smoothed values bring the observed and theoretical 
values of z̄_S closer together. 













(ii) The single term of equation (3) is not adequate to give var(z_S). For example we find: 

Mean value of var(z_S) for ρ = 0.1-0.8† 

                                            n = 10    n = 30    n = 50 
From (3), using smoothed var(r_S)           0.1224    0.0370    0.0219 
From (3), using true var(r_S)               0.1216    0.0374    0.0221 
From sampling experiment                    0.1487    0.0388    0.0228 
From empirical estimate, 1.060/(n-3)        0.1514    0.0392    0.0226 


There appears to be no reason, therefore, to alter the purely empirical suggestion of 
part I, namely to take 

\[
\operatorname{var}(z_S) = \frac{1.060}{n-3}, \qquad \sigma_{z_S} = \frac{1.03}{\sqrt{(n-3)}}.
\]
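
The quantities compared above can be reproduced in a few lines. This is a sketch under stated assumptions: equations (2) and (3) are read as the two-term approximation tanh⁻¹r̄ + r̄ var(r)/(1 - r̄²)² and the one-term approximation var(r)/(1 - r̄²)², and the function names are illustrative, not taken from either paper.

```python
import math


def fisher_z(r):
    """Fisher's transformation, equation (1): z = tanh^-1 r."""
    return math.atanh(r)


def approx_z_moments(r_mean, r_var):
    """Approximate mean and variance of z as read from equations (2) and (3)."""
    var_z = r_var / (1 - r_mean ** 2) ** 2
    mean_z = math.atanh(r_mean) + r_mean * var_z
    return mean_z, var_z


def empirical_var_zs(n):
    """Empirical suggestion of part I for Spearman's z_S: 1.060 / (n - 3)."""
    return 1.060 / (n - 3)


if __name__ == "__main__":
    for n in (10, 30, 50):
        print(n, round(empirical_var_zs(n), 4),
              round(math.sqrt(empirical_var_zs(n)), 4))
    print(approx_z_moments(0.0868, 0.1076))
```

For n = 10, 30 and 50 the empirical variance gives 0.1514, 0.0393 and 0.0226 to four places, in line with the last row of the comparison to within rounding.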


3. THE DISTRIBUTIONS OF THE FISHER-YATES COEFFICIENT r_F 
AND ITS TRANSFORM z_F 


(3.1) Definition of r_F 
In the notation of part I, we have n pairs of associated rankings 

u_1, u_2, ..., u_n and v_1, v_2, ..., v_n, 

where the integers u_i (i = 1, 2, ..., n) may be taken in ascending order 1, 2, ..., n and the 
v_i are a permutation of these integers. If we then attach to both these rankings the 
appropriate normal order statistics ξ(i|n), i.e. the expected values of the ith largest standardized 
deviate in a sample of size n from a normal population, the Fisher-Yates coefficient (1938, 
pp. 14-15) is 
\[
r_F = \sum_{i=1}^{n} \xi(u_i|n)\,\xi(v_i|n) \Big/ \sum_{i=1}^{n} \xi^2(i|n). \tag{4}
\]
Tables of ξ(i|n) have been given in several publications (e.g. Fisher & Yates, 1938, Table 
XX; Pearson & Hartley, 1954, Table 28; Harter, 1961); Fisher & Yates (1938, Table XXI) 
give the expressions Σ ξ²(i|n), summed over i, for n = 2(1)50. 
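
Where the published tables are not to hand, the scores ξ(i|n) can be computed directly. The sketch below is an assumption-laden illustration: it obtains the expected order statistics by trapezoidal integration of the order-statistic density (our choice, not the method used for the published tables), counts i from the smallest deviate (which does not affect r_F, since both rankings use the same convention), and uses function names of our own.

```python
import math


def _pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


def _cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))


def expected_order_stat(i, n, lo=-10.0, hi=10.0, steps=4000):
    """Expected value of the i-th smallest of n standard normal deviates."""
    c = math.comb(n, i) * i            # n! / ((i-1)! (n-i)!)
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        w = 0.5 if k in (0, steps) else 1.0   # trapezoidal end weights
        total += w * x * _pdf(x) * _cdf(x) ** (i - 1) * (1 - _cdf(x)) ** (n - i)
    return c * h * total


def fisher_yates(u, v):
    """Coefficient of equation (4) for two rankings u, v of 1..n."""
    n = len(u)
    xi = [expected_order_stat(i, n) for i in range(1, n + 1)]
    num = sum(xi[a - 1] * xi[b - 1] for a, b in zip(u, v))
    den = sum(x * x for x in xi)
    return num / den


if __name__ == "__main__":
    print(expected_order_stat(2, 2), 1 / math.sqrt(math.pi))
    print(fisher_yates([1, 2, 3, 4, 5], [2, 1, 3, 5, 4]))
```

Identical rankings give r_F = 1 and exactly reversed rankings give r_F = -1, as for the other rank coefficients.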


(3.2) Test of the hypothesis that ρ = 0 
Whatever be the form of the parent distribution, it follows from Pitman (1937) that if 
the two variables are independent 

\[
\mathcal{E}(r_F^2) = \frac{1}{n-1}, \tag{5}
\]
\[
\mathcal{E}(r_F^4) = \frac{1}{(n-1)^2}\left\{\frac{3(n-1)}{n+1} + \frac{(n-2)(n-3)}{n(n^2-1)}\left(\frac{k_4}{k_2^2}\right)^2\right\}, \tag{6}
\]
where k_2 and k_4 are the 2nd and 4th cumulants of the ξ(i|n), i.e. 

\[
k_2 = \frac{1}{n-1}\sum \xi^2(i|n), \qquad
k_4 = \frac{n}{(n-1)(n-2)(n-3)}\left\{(n+1)\sum \xi^4(i|n) - \frac{3(n-1)}{n}\Big[\sum \xi^2(i|n)\Big]^2\right\}. \tag{7}
\]


† As before, in taking averages values for ρ = 0.9 have been omitted. 







The odd moments of r_F are of course zero. The second term within the braces in (6) will 
tend rapidly to zero as n increases. For example, when n = 10 it is found that 

\[
k_4/k_2^2 = -0.4911
\]

so that the two terms inside the braces become 2.4545 and 0.0136, respectively. Thus the 
second term is effectively negligible even when n = 10. As Pitman pointed out, 1/(n-1) 
and 3(n-1)/(n+1) are the variance and β₂ (the ratio of the fourth moment to the squared 
variance) of the distribution 

\[
p(r) = \text{constant} \times (1-r^2)^{\frac12(n-4)} \qquad (-1 \le r \le 1).
\]

It follows that without much loss of accuracy, in testing whether ρ differs from zero we 
may treat r_F as though it were the product moment correlation, r_xy, of n pairs of 
independently distributed normal variables. This result was implied, though not specifically 
stated, on pp. 14-15 of Fisher & Yates's Introduction (1938) and was again referred to by 
Hoeffding (1951, p. 88). 


When ρ ≠ 0, the distributions of r_xy and r_F will not be as similar; for example 

\[
\mathcal{E}(r_F) < \mathcal{E}(r_{xy}).
\]


(3.3) The distributions of r_F, and their mean and variance 


As in the case of Spearman and Kendall coefficients, values of r_F, defined by equation (4), 
were computed for each of the 2500 samples of 10, 833 samples of 30 and 500 samples of 50. 
The resulting distributions are not published here, but are available should any statistician 
wish to make use of them in exploratory work. Table 2 shows the mean and variance of 
each of the 27 distributions, n = 10, 30, 50; ρ = 0.1 (0.1) 0.9. 


Table 2. Mean and variance of r_F. Experimental values 

            n = 10                n = 30                n = 50 
  ρ      Mean    Variance      Mean    Variance      Mean    Variance 
 0.1    0.0868   0.10639      0.0965   0.03278      0.0959   0.01887 
 0.2    0.1740   0.10269      0.1906   0.03285      0.1904   0.02143 
 0.3    0.2640   0.09833      0.2857   0.03032      0.2909   0.01866 
 0.4    0.3550   0.09018      0.3857   0.02506      0.3896   0.01512 
 0.5    0.4380   0.07778      0.4719   0.02152      0.4806   0.01228 
 0.6    0.5387   0.06063      0.5725   0.01603      0.5837   0.009582 
 0.7    0.6415   0.04221      0.6760   0.009859     0.6863   0.005465 
 0.8    0.7298   0.02952      0.7766   0.005240     0.7856   0.002807 
 0.9    0.8457   0.01161      0.8796   0.001954     0.8873   0.000896 








As far as is known, the true moments of r_F for any ρ have not been found although there 
is little doubt that with sufficient effort they could be obtained following the approach of 
David & Mallows (1961). We have therefore proceeded to an empirical examination of the 
distribution of z_F, each experimental value of r_F being converted by the transformation of 
equation (1) and the results tabled. 




Table 3. Comparison of observed and theoretical values for the mean and variance of z_F

Columns:
  (1)  ρ
  (2)  Smoothed r̄_F
  (3)  Smoothed var(r_F)
  (4)  Approximation (3) to var(z_F), i.e. var(r_F)/(1 − r̄_F²)²
  (5)  Experimental var(z_F)
  (6)  tanh⁻¹ r̄_F
  (7)  r̄_F var(r_F)/(1 − r̄_F²)²
  (8)  Cols. (6) + (7)
  (9)  Experimental z̄_F
  (10) tanh⁻¹ ρ

 (1)     (2)      (3)       (4)      (5)       (6)     (7)     (8)     (9)       (10)

n = 10
 0.1   0.0868   0.1076    0.109    0.1368    0.087   0.009   0.096   0.0966    0.1003
 0.2   0.1743   0.1034    0.110    0.1370    0.176   0.019   0.195   0.1971    0.2027
 0.3   0.2626   0.0980    0.113    0.1426    0.269   0.030   0.299   0.3041    0.3095
 0.4   0.3518   0.0897    0.117    0.1482    0.367   0.041   0.408   0.4185    0.4236
 0.5   0.4423   0.0773    0.119    0.1465    0.475   0.053   0.528   0.5268    0.5493
 0.6   0.5346   0.0619    0.121    0.1410    0.596   0.065   0.661   0.6709    0.6931
 0.7   0.6300   0.0450    0.124    0.1349*   0.741   0.078   0.819   0.8401*   0.8673
 0.8   0.7320   0.0282    0.131    0.1543*   0.933   0.096   1.029   1.0290*   1.0986
 0.9   0.8460   0.0121    0.150    0.1654*   1.242   0.127   1.369   1.3637*   1.4722

n = 30
 0.1   0.0950   0.03370   0.0343   0.0360    0.095   0.003   0.099   0.1004
 0.2   0.1902   0.03260   0.0351   0.0379    0.193   0.007   0.199   0.2001
 0.3   0.2854   0.03025   0.0359   0.0384    0.294   0.010   0.304   0.3050
 0.4   0.3810   0.02600   0.0356   0.0359    0.401   0.014   0.415   0.4193
 0.5   0.4770   0.02125   0.0356   0.0382    0.519   0.017   0.536   0.5300
 0.6   0.5739   0.01585   0.0352   0.0358    0.653   0.020   0.674   0.6706
 0.7   0.6724   0.01000   0.0333   0.0331    0.815   0.022   0.837   0.8432
 0.8   0.7740   0.00525   0.0327   0.0342    1.030   0.025   1.056   1.0625
 0.9   0.8804   0.00195   0.0386   0.0397    1.377   0.034   1.411   1.4085

n = 50
 0.1   0.0967   0.02035   0.0207   0.0199    0.097   0.002   0.099   0.0988
 0.2   0.1935   0.01970   0.0213   0.0240    0.196   0.004   0.200   0.1969
 0.3   0.2905   0.01825   0.0218   0.0228    0.299   0.006   0.305   0.3063
 0.4   0.3880   0.01565   0.0217   0.0217    0.409   0.008   0.418   0.4198
 0.5   0.4859   0.01255   0.0215   0.0216    0.531   0.010   0.541   0.5338
 0.6   0.5845   0.00910   0.0210   0.0221    0.669   0.012   0.681   0.6809
 0.7   0.6841   0.00565   0.0200   0.0204    0.837   0.014   0.851   0.8541
 0.8   0.7852   0.00281   0.0191   0.0195    1.059   0.015   1.074   1.0745
 0.9   0.8890   0.00090   0.0205   0.0198    1.417   0.018   1.435   1.4264



































* For n = 10, the few infinite values of z have been ignored for ρ = 0.7, 0.8 and 0.9.










E. C. FIELLER AND E. S. PEARSON


To improve the approximations, (2) and (3), to E(z_F) and var(z_F), the observed values
of r̄_F and var(r_F) were smoothed. This was done graphically, making use of the known values
of these statistics corresponding to the terminal values ρ = 0 and 1. The results are given
in cols. (2) and (3) of Table 3.

(3.4) The variance of z_F

Clearly the most important function of the z-transformation is to provide approximate
normality and a variance as nearly as possible independent of the population ρ. We shall
consider first the variance. It is clear from a comparison of cols. (4) and (5) of Table 3 that,
as in the cases of z_S and z_K considered in part I, the approximation of equation (3) is far
from adequate when n = 10 and still gives values that are too small when n = 30. As the true
mean and variance of r_F are at present undetermined, this is of no great consequence.

What is of much more importance is that the experimental values of var(z_F) have an
average of nearly 1/(n − 3), i.e. the same value as is used for the transformed product moment
correlation coefficient, z_pm. Thus we find:


                                                            n = 10    n = 30    n = 50

Average of estimates of var(z_F)   Approx. (3), smoothed    0.118     0.0347    0.0209
  for ρ = 0.1 to 0.8               Experiment               0.143     0.0362    0.0215
1/(n − 3)                                                   0.143     0.0370    0.0213
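The reference row 1/(n − 3) in the comparison above is simple arithmetic and can be recomputed directly. The following is a minimal Python sketch; the function name is illustrative and does not come from the paper.

```python
def z_variance_reference(n):
    """Reference variance 1/(n - 3) for a z-transformed coefficient.

    Hypothetical helper name; the formula itself is the one quoted in the text.
    """
    if n <= 3:
        raise ValueError("n must exceed 3")
    return 1.0 / (n - 3)


for n in (10, 30, 50):
    # Prints the three reference values to four decimals.
    print(n, round(z_variance_reference(n), 4))
```

The printed values can be set beside the "Experiment" row to judge how closely the sampling results follow the product moment value.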


Certainly the variance will not be exactly constant, but the sampling error obscures such
trend with ρ as is present. We may note that the variance of the product moment z_pm, as
given by Gayen's (1951) amendment to the original expression of Fisher, takes the following
values:

   ρ      n = 10    n = 30    n = 50
  0.1     0.141     0.0370    0.0213
  0.5     0.139     0.0368    0.0212
  0.9     0.134     0.0365    0.0211


(3.5) The expectation of z_F

Column (9) of Table 3 gives the experimental values of z̄_F; the standard errors are,
approximately: for n = 10, 0.008; for n = 30, 50, 0.007. Cols. (6), (7) and (8) were obtained by
inserting the smoothed values of r̄_F and var(r_F) from cols. (2) and (3) into the approximation
to E(z_F) of equation (2). It seems probable that this expression would be adequate were
the true values of r̄_F and var(r_F) known.

Because of the very small change in E(z_F | ρ, n) with n, a knowledge of this function is not
in general required in using tests of significance. As, however, it might be needed in
connexion with estimation, it was thought that a purely empirical formula (with no theoretical
significance) would be worth recording. For simplicity this will need to be expressed directly
in terms of ρ and n rather than the unknown r̄_F and var(r_F).

Column (10) of the top section of Table 3 gives values of tanh⁻¹ ρ. It is seen that the
experimental means lie between these values and tanh⁻¹ r̄_F. This suggested that an
expression of the form tanh⁻¹ ρ′, with ρ′ < ρ, might be used. After several attempts, it was found
that a good approximation could be obtained by putting ρ′ = ρ{1 − 0.6/(n + 8)}, i.e. taking

$$E(z_F) \approx \tanh^{-1}\left\{\rho\left(1 - \frac{0.6}{n+8}\right)\right\}. \qquad (8)$$
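The empirical formula (8) is easy to evaluate numerically. The sketch below, in Python, tabulates it for the sample sizes used in the experiment; the function name is an illustrative choice, and only the formula itself is taken from the text.

```python
import math


def expected_z_fisher_yates(rho, n):
    """Empirical approximation (8): tanh^-1 of rho shrunk by 1 - 0.6/(n + 8)."""
    rho_prime = rho * (1.0 - 0.6 / (n + 8))
    return math.atanh(rho_prime)


for n in (10, 30, 50):
    # One row per sample size, rho = 0.1 (0.1) 0.9, rounded to three decimals.
    row = [round(expected_z_fisher_yates(k / 10, n), 3) for k in range(1, 10)]
    print(n, row)
```

Since the shrinkage factor tends to 1 as n grows, the computed values approach tanh⁻¹ ρ for large samples.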
















Table 4 compares this approximation with the figures obtained previously by inserting
smoothed values of r̄_F and var(r_F) into equation (2). The agreement certainly appears
adequate for practical use and is likely to hold to the same order of approximation for
intermediate values of ρ and n.


Table 4. Comparison of (i) tanh⁻¹(ρ′) = tanh⁻¹[ρ{1 − 0.6/(n + 8)}] and (ii) the
approximation E(z_F) = tanh⁻¹ r̄_F + r̄_F var(r_F)/(1 − r̄_F²)² (smoothed)

              n = 10                  n = 30                  n = 50
  ρ     tanh⁻¹(ρ′)   E(z_F)     tanh⁻¹(ρ′)   E(z_F)     tanh⁻¹(ρ′)   E(z_F)
 0.1      0.097      0.096        0.099      0.099        0.099      0.099
 0.2      0.196      0.195        0.199      0.199        0.201      0.200
 0.3      0.299      0.299        0.304      0.304        0.306      0.305
 0.4      0.408      0.408        0.416      0.415        0.419      0.418
 0.5      0.527      0.528        0.539      0.536        0.542      0.541
 0.6      0.662      0.661        0.673      0.674        0.684      0.681
 0.7      0.823      0.819        0.846      0.837        0.853      0.851
 0.8      1.029      1.029        1.065      1.056        1.076      1.074
 0.9      1.333      1.369        1.402      1.411        1.425      1.435





























(3.6) Shape of the distribution of z_F

Values of the moment ratios √b_1 and b_2 were calculated for all distributions except the
three containing infinite values of z (i.e. n = 10; ρ = 0.7, 0.8, 0.9). They are shown in
Tables 5 and 6, together with the corresponding values for z_S.† The large sample
approximations to the standard errors of √b_1 and b_2 in sampling from a normal distribution, namely
√(6/N) and √(24/N), give

              N = 2500 for    N = 833 for    N = 500 for
                 n = 10          n = 30         n = 50
  σ(√b_1)        0.049           0.085          0.110
  σ(b_2)         0.098           0.170          0.219
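The standard errors quoted above follow directly from the large-sample expressions √(6/N) and √(24/N). A short Python sketch of the arithmetic, with an illustrative function name:

```python
import math


def moment_ratio_standard_errors(num_samples):
    """Large-sample normal-theory standard errors of sqrt(b1) and b2.

    num_samples is N, the number of experimental values in one distribution.
    """
    return math.sqrt(6.0 / num_samples), math.sqrt(24.0 / num_samples)


for n, big_n in ((10, 2500), (30, 833), (50, 500)):
    se_skew, se_kurt = moment_ratio_standard_errors(big_n)
    print(f"n = {n:2d}  N = {big_n:4d}  "
          f"se(sqrt b1) = {se_skew:.3f}  se(b2) = {se_kurt:.3f}")
```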


Except perhaps in the case of samples of n = 10, there is no consistent evidence of
asymmetry in the distributions of either z_S or z_F. On the other hand the distributions are
clearly leptokurtic, with β_2 > 3. This effect seems to be appreciably the same for both z_S
and z_F and is definitely rather greater than that for the transformed product moment
coefficient z_pm. The β_2 values for the last, averaged for ρ = 0.1 to 0.8, are shown at the bottom
of Table 6.

To obtain some idea of the effect of this amount of kurtosis on tests of significance (based
on the assumption of normality), we may suppose that the distribution of z_F is represented
by a symmetrical Pearson curve (a type VII or t-distribution) with the same β_2 values.


† (It appears that the values of these statistics were not completely calculated by E.C.F. for
Kendall's coefficient. E.S.P.)







Table 5. Values of √b_1 for distributions of z

            z_S (experiment)              z_F (experiment)          z_pm for
  ρ     n = 10   n = 30   n = 50      n = 10   n = 30   n = 50      n = 10*
 0.1    −0.17     0.03     0.01       −0.11     0.00    −0.04        0.000
 0.2     0.02    −0.07    −0.13       −0.02    −0.04    −0.12        0.000
 0.3     0.09    −0.03     0.05        0.06    −0.04    −0.03        0.001
 0.4     0.18    −0.02     0.10        0.14    −0.17     0.01        0.002
 0.5     0.21     0.26     0.24        0.12     0.14     0.12        0.005
 0.6     0.17    −0.08    −0.07        0.11    −0.16    −0.09        0.008
 0.7      –      −0.07     0.12         –      −0.07     0.08        0.013
 0.8      –       0.07     0.04         –       0.09     0.05        0.019
 0.9      –       0.17     0.00         –       0.22     0.04        0.027

* From the Fisher-Gayen formula.

Table 6. Values of b_2 for distributions of z

                      z_S (experiment)              z_F (experiment)
  ρ               n = 10   n = 30   n = 50      n = 10   n = 30   n = 50
 0.1              3.641    3.398    3.024       3.427    3.360    3.242
 0.2              3.487    3.103    2.940       3.325    3.218    2.814
 0.3              3.308    3.302    3.335       3.237    3.377    2.946
 0.4              3.353    3.151    2.996       3.323    3.092    3.017
 0.5              3.543    3.297    3.163       3.692    3.228    3.010
 0.6              3.267    3.163    3.297       3.397    3.316    3.326
 0.7                –      3.224    3.043         –      3.265    2.791
 0.8                –      3.036    2.641         –      2.925    2.791
 0.9                –      3.083    2.437         –      3.125    2.644
 Mean for
 ρ = 0.1 to 0.8*  3.433    3.209    3.055       3.400    3.223    2.992
 Mean for z_pm†   3.274    3.074    3.043         –        –        –


























* For n = 10, the mean is only for ρ = 0.1 to 0.6.
† From the Fisher-Gayen formula.




Table 42 of Pearson & Hartley (1954) shows the following modification of standardized
percentage points, according to β_2 (when β_1 = 0):

                               5%      2.5%     1.0%     0.5%     0.25%†
  Normal curve, β_2 = 3.0     1.64     1.96     2.33     2.58     2.81
  Type VII,     β_2 = 3.2     1.64     1.97     2.37     2.65     2.91
                β_2 = 3.4     1.64     1.98     2.40     2.71     3.00


The effect only becomes of importance at extreme significance levels and, presumably,
has never been taken into account in the practical application of the z_pm transformation in
small samples.

Normal curves were fitted to the distributions and the values of χ² obtained were of about
the same average size as those given for z_S and z_K in Table 7 of part I.


4. THE SENSITIVITY OF RANK CORRELATION MEASURES TO CHANGES IN p 


In § 5 of part I brief consideration was given to the question of power. It was pointed out
that the power of discrimination of any one of these alternative correlation measures would
depend on the rapidity with which its sampling distributions 'drew clear' of one another
as the population value of ρ changed. It was suggested that a rough measure of local
sensitivity would be given by the ratios (z̄_2 − z̄_1)/√(s²_z1 + s²_z2) of

(a) the difference between pairs of consecutive entries for z̄_F in col. (9) of Table 3 (or
corresponding values for z̄_S and z̄_K in part I), and

(b) the square root of the sum of the corresponding pair of variances from col. (5) of the
same table.
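The ratio defined by (a) and (b) can be reproduced from the experimental columns of Table 3. The Python sketch below is illustrative only: the function name is an assumption, and the input figures are the n = 10 entries for ρ = 0.1 and 0.2 from cols. (9) and (5).

```python
import math


def sensitivity_ratio(z_mean_1, z_mean_2, z_var_1, z_var_2):
    """Difference of consecutive mean z values over the root of the summed variances."""
    return (z_mean_2 - z_mean_1) / math.sqrt(z_var_1 + z_var_2)


# n = 10, rho = 0.1 and 0.2: experimental means (col. 9) and variances (col. 5).
ratio = sensitivity_ratio(0.0966, 0.1971, 0.1368, 0.1370)
print(round(ratio, 3))
```

Applying the same function to each consecutive pair of ρ values, for each n, gives one column of sensitivity ratios.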


Table 7. Sensitivity ratios (z̄_2 − z̄_1)/√(s²_z1 + s²_z2) for z_F

  ρ_1, ρ_2     n = 10    n = 30    n = 50
  0.1, 0.2     0.192     0.367     0.471
  0.2, 0.3     0.202     0.380     0.506
  0.3, 0.4     0.212     0.419     0.538
  0.4, 0.5     0.199     0.407     0.548
  0.5, 0.6     0.269     0.517     0.704
  0.6, 0.7     0.322     0.658     0.840
  0.7, 0.8     0.351     0.845     1.10
  0.8, 0.9     0.591     1.35      1.78





Table 8 of part I compared these sensitivity ratios for the z transforms of the product
moment and of Spearman's and Kendall's rank correlation coefficients. Table 7 gives
similar ratios for the Fisher-Yates coefficient. On comparing the two tables it will be
found that with very few exceptions, which are probably the result of sampling fluctuations,‡
the sensitivity ratio for z_F lies between that for z_pm and those for z_S and z_K. In so far as the
variances of the z's are nearly constant, the sensitivity in detecting differences in ρ greater
than 0.1 will be found by adding two or more of the consecutive local ratios. The superiority


† Results for 0.25% from a table in preparation by N. L. Johnson & E. Nixon.
‡ As in part I, the ratios are calculated from the unsmoothed experimental means and variances.







of z_F over z_S and z_K will then stand out further still. It follows from this evidence that for
the population model considered a test based on the Fisher-Yates coefficient is more likely
to detect differences in the population ρ, if they exist, than tests based on the other two rank
coefficients. This empirical result confirms what would be expected on theoretical grounds.


5. A NUMERICAL EXAMPLE 


In Table 8 there are shown for each of four samples, n_t pairs of associated rankings
(t = 1, 2, 3, 4), where n_1 = 15, n_2 = 12, n_3 = 19 and n_4 = 14. Assuming that the underlying
ranked variables x_ti, y_ti follow four bivariate normal distributions, are the results consistent
with the hypothesis that these parent distributions have a common coefficient of
correlation ρ? If the parent distributions are not bivariate normal, it is assumed that they each


Table 8. Comparison of four tests for heterogeneity of correlation

                          Values of v_i
  u_i    1st sample   2nd sample   3rd sample   4th sample
          n_1 = 15     n_2 = 12     n_3 = 19     n_4 = 14
   1          9            2            3            3
   2         11            3            8            1
   3          4            6            1            4
   4          7            4            9            2
   5          2            1            2            7
   6          5           12           10            6
   7          1            9            5            9
   8         10            8            6            8
   9         13            5           18            5
  10         14           10            4           13
  11          6           11           19           10
  12         15            7           11           12
  13         12            –           16           11
  14          3            –           17           14
  15          8            –           15            –
  16          –            –           13            –
  17          –            –            7            –
  18          –            –           14            –
  19          –            –           12            –

  r_S       0.211        0.622        0.583        0.894
  r_K       0.143        0.424        0.368        0.714
  r_F       0.164        0.552        0.518        0.875
  r_pm      0.161        0.500        0.562        0.864          z̄        χ²

  z_S       0.214        0.729        0.666        1.444        0.743     8.36
  z_K       0.144        0.453        0.387        0.896        0.454     7.04
  z_F       0.166        0.621        0.573        1.354        0.659     8.36
  z_pm      0.163        0.550        0.636        1.308        0.656     7.70



















have the property of being convertible to the normal by monotonic transformations applied 
to their margins. We should then be testing whether the distributions, after transformation, 
had a common p. 

We can now proceed by using any one of the three rank correlation coefficients r_S, r_K
or r_F, applying the z-transformation to the coefficients calculated for each sample and then
determining

$$\chi^2 = \sum_{t=1}^{4} w_t (z_t - \bar{z})^2 \quad \text{with } \nu = 3 \text{ degrees of freedom}, \qquad (9)$$

where

$$\bar{z} = \sum_{t=1}^{4} w_t z_t \Big/ \sum_{t=1}^{4} w_t \qquad (10)$$

and w_t is the reciprocal of the variance appropriate to the type of coefficient chosen. That is
to say, using the empirical results:

    for Spearman's coefficient          w_t = (n_t − 3)/1.060,
    for Kendall's coefficient           w_t = (n_t − 4)/0.437,
    for the Fisher-Yates coefficient    w_t = n_t − 3.


At the bottom of Table 8, the correlation coefficients are shown with their z-transforms
and also the mean z̄ of equation (10) and the χ² values of equation (9). The rankings were in
fact based on samples of 15, 12, 19 and 14 taken from Fieller, Lewis & Pearson's Tables of
Correlated Normal Deviates (1955), the parent correlations being different, namely

    ρ_1 = 0.1,  ρ_2 = 0.3,  ρ_3 = 0.6,  ρ_4 = 0.7.

We have also, therefore, shown the four values of r_pm and their z-transforms and the
resulting weighted mean z̄ and χ². For ν = 3 degrees of freedom, the 10% point of χ² is
at 6.25, the 5% at 7.81 and the 2.5% at 9.35. All four values are therefore on the borderline
of significance. The χ²'s for the Spearman and Fisher-Yates coefficients have come out with
the top value of 8.36, followed by that for the product moment coefficient with 7.70 and
Kendall's coefficient with 7.04.
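The weighted mean of equation (10) and the χ² of equation (9) can be recomputed from the z values in Table 8. The Python sketch below is illustrative only: the function and dictionary names are assumptions, and the weight n_t − 3 used for the product moment coefficient is taken from the variance 1/(n − 3) quoted for z_pm in the text.

```python
def heterogeneity_chi2(z_values, weights):
    """Return (z_bar, chi2): weighted mean as in (10) and chi-square as in (9)."""
    total_weight = sum(weights)
    z_bar = sum(w * z for w, z in zip(weights, z_values)) / total_weight
    chi2 = sum(w * (z - z_bar) ** 2 for w, z in zip(weights, z_values))
    return z_bar, chi2


sizes = [15, 12, 19, 14]

# z-transforms read from Table 8, paired with the weights quoted in the text.
cases = {
    "spearman": ([0.214, 0.729, 0.666, 1.444], [(n - 3) / 1.060 for n in sizes]),
    "kendall": ([0.144, 0.453, 0.387, 0.896], [(n - 4) / 0.437 for n in sizes]),
    "fisher_yates": ([0.166, 0.621, 0.573, 1.354], [n - 3 for n in sizes]),
    "product_moment": ([0.163, 0.550, 0.636, 1.308], [n - 3 for n in sizes]),
}

results = {name: heterogeneity_chi2(z, w) for name, (z, w) in cases.items()}
for name, (z_bar, chi2) in results.items():
    print(f"{name:15s} z_bar = {z_bar:.3f}  chi2 = {chi2:.2f}")
```

Each χ² is then referred to the χ² distribution with 3 degrees of freedom, whose 5% point is 7.81 as stated above.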

It must, of course, be emphasized that the ordering of χ² in a single set of samples will
not necessarily agree with the order of efficiency with which the tests would be expected to
detect differences in the ρ-values in the long run. For given differences in ρ, we should expect
that χ² based on z_pm would on the average be greatest, the χ² based on z_F coming second.

From the point of view of computation, the coefficients r_S and r_K are obtained most
easily while r_pm involves the longest calculation; after this stage, the procedure is in all cases
the same apart from the difference in weights.
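To illustrate how directly r_S and r_K come from a pair of rankings, here is a minimal Python sketch applied to the second sample of Table 8. It assumes that the first ranking is the natural order 1, ..., 12 and that there are no ties; both function names are illustrative, and the formulas are the usual textbook definitions rather than anything specific to this paper.

```python
from itertools import combinations


def spearman_rank_correlation(u, v):
    """Spearman's coefficient from two untied rankings: 1 - 6*sum(d^2)/(n(n^2 - 1))."""
    n = len(u)
    sum_sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return 1.0 - 6.0 * sum_sq_diff / (n * (n * n - 1))


def kendall_rank_correlation(u, v):
    """Kendall's coefficient: (concordant - discordant pairs) / (n(n - 1)/2), no ties."""
    n = len(u)
    score = 0
    for (a1, b1), (a2, b2) in combinations(list(zip(u, v)), 2):
        product = (a1 - a2) * (b1 - b2)
        score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


# Second sample of Table 8 (n = 12); the first ranking is assumed to be 1..12.
u = list(range(1, 13))
v = [2, 3, 6, 4, 1, 12, 9, 8, 5, 10, 11, 7]
print(round(spearman_rank_correlation(u, v), 3),
      round(kendall_rank_correlation(u, v), 3))
```

The Fisher-Yates coefficient is not sketched here because it needs a table of expected normal order statistics in addition to the rankings.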


6. DISCUSSION


The results discussed in the two papers of this series apply to rankings generated by
sampling either from (a) a bivariate normal distribution, or (b) a class of bivariate
distributions which are convertible to the normal form by monotonic transformations applied to
the margins. In case (b), ρ is the coefficient of correlation in the transformed, not the original
distribution. The properties of the class under (b), which contains a wide range of skew
bivariate forms, have not been extensively explored (but see Johnson, 1949).

On theoretical grounds it would be expected that when dealing with a bivariate normal
parent, rank order tests based on r_F would be more powerful than those based on r_S. The
coefficient r_K based on paired comparisons falls in a rather different category.







By making use of the z-transformation, all three coefficients are put into a simple
comparable form. In all three cases the distributions of z have variances nearly independent
of ρ, and are closely symmetrical but slightly leptokurtic.

The advantage of the Fisher-Yates coefficient z_F over the other two lies in its somewhat
greater power of discrimination, and in the fact that the sampling variance of z_F seems to
have nearly the same value as that of z_pm, i.e. 1/(n − 3), an easy expression to remember.
A disadvantage, which certainly carries some practical weight, is that a table or tables are
needed to calculate r_F.† Thus the test does not fall within the class of quick methods which,
to quote a description once used by Student, can be worked out in a railway train on the
back of an envelope.


As in connection with part I, the authors have been greatly indebted to Miss M. U. 
Thomas, Mrs Esmé Hill and Mr T. Vickers for extensive help in computation. 


REFERENCES 


David, F. N. & Mallows, C. L. (1961). The variance of Spearman's 'rho' in normal samples.
Biometrika, 48, 19-28.

Fieller, E. C., Hartley, H. O. & Pearson, E. S. (1957). Tests for rank correlation coefficients. I.
Biometrika, 44, 470-81.

Fieller, E. C., Lewis, T. & Pearson, E. S. (1955). Correlated random normal deviates. Tracts for
Computers, no. XXXVI. Cambridge University Press.

Fisher, R. A. & Yates, F. (1938). Statistical Tables for Biological, Agricultural and Medical Research
(5th edition, 1957). Edinburgh: Oliver and Boyd.

Gayen, A. K. (1951). The frequency distribution of the product-moment correlation coefficient in
random samples of any size drawn from non-normal universes. Biometrika, 38, 219-47.

Harter, H. L. (1961). Expected values of normal order statistics. Biometrika, 48, 151-65.

Hoeffding, W. (1951). Optimum non-parametric tests. Proceedings of the Second Berkeley
Symposium on Mathematical Statistics and Probability, pp. 83-92. University of California Press.

Johnson, N. L. (1949). Bivariate distributions based on simple transformation systems. Biometrika,
36, 297-304.

Pearson, E. S. & Hartley, H. O. (1954). Biometrika Tables for Statisticians. Cambridge University
Press.

Pitman, E. J. G. (1937). Significance tests which may be applied to samples from any populations.
II. The correlation coefficient. J. R. Statist. Soc. Suppl. 4, 225-32.


† A table is of course required in all cases to convert r to z.



















Biometrika (1961), 48, 1 and 2, p. 41
Printed in Great Britain 


Some methods of constructing exact tests†


By J. DURBIN 
Research Technique Division, London School of Economics 


1. INTRODUCTION 


For testing the hypothesis that a set of observations x_1, ..., x_n is a random sample from
a distribution having a specified continuous distribution function F(x) a variety of
procedures are available, the most celebrated being Pearson's χ² test. The χ² test is flexible and
easy to use but it possesses an element of arbitrariness in the choice of group boundaries;
moreover, it does not have the desirable property of being an exact test in the sense of giving
exact probabilities of rejection when the null hypothesis is true. The Kolmogorov test, which
has been described and compared with related tests in an excellent article by Darling (1957),
is free from these objections and is known to have good asymptotic power properties against
alternatives specified in terms of distance between distribution functions. Nevertheless, in
my experience its performance in practice with samples of moderate size has been
disappointing in that it has frequently failed to give a significant result when the χ² test has
done so. Some empirical results which illustrate this tendency will be given later in the paper.

The reason for the disappointing performance of the Kolmogorov test is not far to seek. 
If the alternative distribution is such that the difference between the hypothetical distribu- 
tion function and the alternative distribution function is relatively large for some value of 
the variate, one would anticipate that Kolmogorov’s test would have high power. Such 
would be the case, for example, if the two distributions differed only in location. If, however, 
the difference between the two distribution functions is nowhere large, as is quite possible if 
their means and variances are the same even though the two frequency functions differ 
markedly in shape, one would not anticipate that Kolmogorov’s statistic would be a 
powerful discriminator between the two hypotheses. 

Putting u_j = F(x_j) (j = 1, ..., n), the hypothesis that x_1, ..., x_n have distribution function
F(x) is equivalent to the hypothesis that u_1, ..., u_n are U(0, 1) variables, i.e. are uniformly
distributed on the interval (0, 1). Let u_1, ..., u_n be ordered to give the ordered values
u_(1) ≤ u_(2) ≤ ... ≤ u_(n). The first part of this paper shows how to transform u_(1), ..., u_(n) to a new
set of values w_1, ..., w_n which have the same joint distribution as u_(1), ..., u_(n) when the null
hypothesis is true but which give a set of points tending to be more heavily concentrated
towards the left-hand end of the (0, 1) interval than towards the right-hand end under a wide
class of alternatives. A variety of procedures can now be used for testing the hypothesis that
w_1, ..., w_n are ordered U(0, 1) variables, special interest being attached in this paper to the
one-sided Kolmogorov test. An indication is given in § 6 of applications to some problems
other than those of goodness-of-fit, namely, to tests based on the periodogram in time-series
analysis and to tests of the distribution of the intervals between successive events.
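The first step, u_j = F(x_j) followed by ordering, can be sketched in a few lines of Python. The example assumes, purely for illustration, that the hypothetical distribution is the standard normal; the function names and the sample values are not from the paper.

```python
import math


def standard_normal_cdf(x):
    """Distribution function of N(0, 1), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordered_probability_transform(sample, cdf):
    """Return the ordered values u_(1) <= ... <= u_(n), where u_j = F(x_j)."""
    return sorted(cdf(x) for x in sample)


# Illustrative observations only.
sample = [0.31, -1.20, 0.05, 1.75, -0.44]
u_ordered = ordered_probability_transform(sample, standard_normal_cdf)
print([round(u, 4) for u in u_ordered])
```

Under the null hypothesis the resulting values behave like an ordered sample from U(0, 1), which is the starting point for the transformation to w_1, ..., w_n.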


† This paper is based mainly on research done at the University of North Carolina, Chapel Hill,
with the support of the Office of Naval Research. The principal results were presented in an invited 
address delivered to the Institute of Mathematical Statistics and the Biometric Society at New York, 
26 April 1960. The work was completed and the paper written at Stanford University with the support 
of the National Science Foundation. 





42 J. DuRBIN 


Because of wide applicability to problems in statistics and probability as well as for
purely mathematical interest, distributions related to points scattered at random in an
interval have been extensively studied in many different guises. No attempt will be made
here to review the entire field; the following list of references is intended merely to indicate
work which seems to me to be fairly closely connected with the subject matter of this paper.
Among contributions concerned specifically with the random division of an interval we may
mention those by Whitworth (1887), Moran (1947, 1951), Sherman (1950), Mauldon (1951),
Darling (1953), Irwin (1955) and Barton & David (1955, 1956a, b). The following papers are
concerned with goodness-of-fit aspects: K. Pearson (1933), Neyman (1937), E. S. Pearson
(1939) and Darling (1957). Authors dealing with similar problems arising in the study of
exponential distributions are Sukhatme (1936, 1937), Epstein & Tsao (1953), Epstein &
Sobel (1954), Bartholomew (1957) and Epstein (1960). For applications to Poisson processes
we mention the papers by Greenwood (1946), Maguire, Pearson & Wynn (1952, 1953),
Barnard (1953), Cox (1955), and Bartholomew (1956). Relevant papers concerned primarily
with the order statistics are those by Wilks (1948), Malmquist (1951), Rényi (1953) and van
Dantzig (1954).

A severe limitation to the use hitherto of the Kolmogorov test has been that it has only 
been available for tests of simple hypotheses. Thus it has not been usable even for the 
classical problem of testing for normality, except as an approximation, since the hypothesis 
is a composite one involving two nuisance parameters, the population mean and variance. 
In § 6 a general method is proposed for eliminating nuisance parameters, thereby enabling us
to convert a composite hypothesis into a simple hypothesis. The idea underlying the method 
is to replace sufficient estimators for the unknown nuisance parameters by values picked 
at random from appropriate distributions. The price we pay for the elimination of the 
nuisance parameters is the introduction of an extraneous element of randomization. What 
we get in return is an exact knowledge of the probability of rejection. 


2. RE-ORDERING OF INTERVALS 


Suppose we have a sample of independently and identically distributed observations
x_1, ..., x_n and wish to test the hypothesis that they come from a distribution with
continuous distribution function F(x). When the null hypothesis is true the values u_j = F(x_j)
(j = 1, ..., n) are independent U(0, 1) variables. This implies that the n points determined
by u_1, ..., u_n are randomly scattered on the (0, 1) interval. One would expect a departure
from the null hypothesis to be indicated by a tendency for some of the intervals between
adjacent points to be shorter and for some of the intervals between adjacent points to be
longer than is found on the hypothesis of random scatter. This suggests that we take as our
starting point a study of the relative lengths of the intervals between adjacent points.

Denoting the ordered u's by u_(1), ..., u_(n), these have the distribution

$$dP = n!\,du_{(1)} \cdots du_{(n)} \qquad (0 \le u_{(1)} \le u_{(2)} \le \cdots \le u_{(n)} \le 1). \qquad (1)$$

Let c_1 = u_(1), c_j = u_(j) − u_(j−1) (j = 2, ..., n) and c_{n+1} = 1 − u_(n). Since the Jacobian is unity,
the c's have the distribution

$$dP = n!\,dc_1 \cdots dc_n \qquad \Bigl(c_1 \ge 0, \ldots, c_{n+1} \ge 0,\ \sum_{j=1}^{n+1} c_j = 1\Bigr). \qquad (2)$$

A thorough discussion of this distribution has been given by Wilks (1948). Note that we
avoid difficulties arising from the fact that the distribution of c_1, ..., c_{n+1} is singular by













writing down the probability element for c_1, ..., c_n; the same device is used below for
obtaining the distribution of the quantities c_(1), ..., c_(n+1) and g_1, ..., g_{n+1}.

Since we are interested in the relative magnitudes of the c's it is natural to consider the
ordered c's, c_(1) ≤ c_(2) ≤ ... ≤ c_(n+1). Since there are (n + 1)! ways of permuting n objects from
n + 1 the joint distribution of c_(1), ..., c_(n) is, using (2),

$$dP = (n+1)!\,n!\,dc_{(1)} \cdots dc_{(n)} \qquad \Bigl(0 \le c_{(1)} \le \cdots \le c_{(n+1)},\ \sum_{j=1}^{n+1} c_{(j)} = 1\Bigr). \qquad (3)$$


We now transform c_(1), ..., c_(n+1) into a more manageable form by means of the
transformation

$$g_j = (n+2-j)\,(c_{(j)} - c_{(j-1)}) \qquad (c_{(0)} = 0;\ j = 1, \ldots, n+1). \qquad (4)$$

The Jacobian is 1/(n + 1)!. Moreover, each g_j ≥ 0 and

$$\sum_{j=1}^{n+1} g_j = \sum_{j=1}^{n+1} c_{(j)} = 1.$$

Consequently the distribution of g_1, ..., g_n is

$$dP = n!\,dg_1 \cdots dg_n \qquad \Bigl(g_1 \ge 0, \ldots, g_{n+1} \ge 0,\ \sum_{j=1}^{n+1} g_j = 1\Bigr). \qquad (5)$$

Comparing (5) with (2) we see that the two distributions are identical. We therefore have the
remarkable result that g_1, ..., g_{n+1}, which depend on the ordered intervals, have the same
distribution as the unordered intervals c_1, ..., c_{n+1}. This result provides the key to the methods
of test construction proposed in this and the following section. It depends essentially on the
transformation (4) introduced by Sukhatme (1937) in the study of ordered exponentially
distributed variables and used by Mauldon (1951) and Dwass (1959) in the study of random
intervals.

Putting

$$w_r = \sum_{j=1}^{r} g_j \qquad (6)$$

it follows that w_1, ..., w_n have the same distribution as the ordered U(0, 1) variables
u_(1), ..., u_(n). Consequently any test procedure depending on u_(1), ..., u_(n), such as the
Kolmogorov test, has the same properties under the null hypothesis when it is based on w_1, ..., w_n
as when it is based on u_(1), ..., u_(n).
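The chain of definitions (intervals c_j, ordered intervals c_(j), the transformation (4) and the partial sums (6)) can be followed step by step in code. The Python sketch below is an illustration under the definitions given in the text; the function name is an assumption.

```python
def reordered_interval_transform(u):
    """Map u_1..u_n in (0, 1) to w_1..w_n through the ordered intervals.

    Steps: sort the u's, form the n + 1 intervals c_j, order them, apply
    g_j = (n + 2 - j)(c_(j) - c_(j-1)) with c_(0) = 0, and cumulate to w_r.
    """
    n = len(u)
    bounds = [0.0] + sorted(u) + [1.0]
    intervals = [bounds[j + 1] - bounds[j] for j in range(n + 1)]  # c_1..c_{n+1}
    ordered = sorted(intervals)                                     # c_(1)..c_(n+1)

    g = []
    previous = 0.0
    for j, c_j in enumerate(ordered, start=1):
        g.append((n + 2 - j) * (c_j - previous))
        previous = c_j

    w = []
    running = 0.0
    for g_j in g[:n]:  # w_r is the sum of the first r values of g
        running += g_j
        w.append(running)
    return w


print([round(x, 4) for x in reordered_interval_transform([0.2, 0.5, 0.9])])
```

For the points 0.2, 0.5, 0.9 the intervals are 0.2, 0.3, 0.4, 0.1, so the ordered intervals differ by 0.1 at each step and the g's are 0.4, 0.3, 0.2, 0.1. The w's are nondecreasing by construction because every g_j is nonnegative.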

It might be asked what has been gained by transforming to a set of values which have 
the same distribution as have the values we started with. The answer is that we hope to gain 
power. Some considerations which indicate why one would expect power to be increased in 
many situations will be discussed in the next two paragraphs. 

The type of alternative we have in mind in this paper is one in which small intervals tend
to be smaller and large intervals larger than on the null hypothesis. A way of making this
idea precise is the following. Let primes denote values obtained on the alternative hypothesis,
i.e. c'_j, c'_(j), g'_j and w'_j are the values on the alternative hypothesis corresponding to values
c_j, c_(j), g_j and w_j on the null hypothesis. Suppose that whenever c_i < c_j then c_j/c_i < c'_j/c'_i,
i.e. ratios of larger to smaller intervals tend to become magnified on the alternative
hypothesis. We shall show that this implies that w'_j ≤ w_j (j = 1, ..., n). For c_(j) < c_(j+1), so that

$$c_{(j+1)}/c_{(j)} < c'_{(j+1)}/c'_{(j)} \qquad (j = 1, \ldots, n)$$










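(we ignore the event c_(j) = c_(j+1) since this has probability zero). Thus

$$\frac{c'_{(j+1)} - c'_{(j)}}{c'_{(j)} - c'_{(j-1)}}
  = \frac{c'_{(j+1)}/c'_{(j)} - 1}{1 - c'_{(j-1)}/c'_{(j)}}
  > \frac{c_{(j+1)}/c_{(j)} - 1}{1 - c_{(j-1)}/c_{(j)}}
  = \frac{c_{(j+1)} - c_{(j)}}{c_{(j)} - c_{(j-1)}},$$

so that

$$\frac{g'_{j+1}}{g'_j} > \frac{g_{j+1}}{g_j} \qquad (j = 1, \ldots, n),$$

i.e. g'_j/g_j is an increasing function of j. But

$$\sum_{j=1}^{n+1} g'_j = \sum_{j=1}^{n+1} g_j = 1.$$

Consequently g'_1/g_1 ≤ 1 and g'_{n+1}/g_{n+1} ≥ 1. These results together imply that there is a value
r such that g'_i ≤ g_i for i = 1, ..., r and g'_i > g_i for i > r, where 1 ≤ r < n + 1. Since g'_j ≤ g_j for
j ≤ r we have

$$w'_j = \sum_{i=1}^{j} g'_i \le \sum_{i=1}^{j} g_i = w_j \qquad (j \le r),$$

while for j > r we have

$$w'_j = 1 - \sum_{i=j+1}^{n+1} g'_i < 1 - \sum_{i=j+1}^{n+1} g_i = w_j.$$

Thus w'_j ≤ w_j for all j.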

It is remarkable that assuming only that ratios of larger to smaller intervals become
greater as we proceed from the null to the alternative hypothesis we have been able to show
that all values of the w's are diminished. In this situation the sample distribution function
calculated from the w's is never less on the alternative hypothesis than on the null hypothesis.
It is unlikely that the sample distribution function of the original u's would give as clear an
indication of departure from the null hypothesis as that of the w's except perhaps for special
types of departure such as those arising from change of location, for the detection of which
a goodness-of-fit test would not normally be employed.

We observe in passing some important results which are, however, only of incidental
interest for the purpose of this paper. First, we note that the quantities

$$\{c_j - c_{(1)}\}/\{1 - (n+1)\,c_{(1)}\} \qquad (j = 1, \ldots, n+1),$$

deleting the one which is identically zero, are distributed like the intervals formed by n − 1
random points on the (0, 1) interval. This follows easily by considering the conditional
distribution of c_(2), ..., c_(n+1) given c_(1) and transforming back to the unordered c_j's. As a
consequence we infer the general result that the quantities

$$\{c_j - c_{(r)}\}/\{1 - c_{(1)} - \cdots - c_{(r-1)} - (n+2-r)\,c_{(r)}\} \qquad (j = 1, \ldots, n+1),$$

deleting the r values which are negative or identically zero, are distributed like the intervals
determined by n − r random points on the (0, 1) interval. The quantities

$$\{c_{(j)} - c_{(j-1)}\}/\{1 - c_{(1)} - \cdots - c_{(j-2)} - (n+3-j)\,c_{(j-1)}\} \qquad (c_{(0)} = 0;\ j = 1, \ldots, n)$$

are therefore independently distributed like the shortest of the intervals into which the
interval (0, 1) is divided by samples of n, n − 1, ..., 1 random points, respectively. Now
(n + 1) c_(1) is distributed like c_1 and c_1 has the distribution dP = n(1 − c_1)^{n−1} dc_1 (Wilks, 1948),













whence {1 − (n + 1) c_(1)}^n is a U(0, 1) variable. From this we deduce the general result that the
quantities

$$\left\{1 - \frac{(n+2-j)\,(c_{(j)} - c_{(j-1)})}{1 - c_{(1)} - \cdots - c_{(j-2)} - (n+3-j)\,c_{(j-1)}}\right\}^{n+1-j} \qquad (c_{(0)} = 0;\ j = 1, \ldots, n)$$

are independent U(0, 1) variables.


3. SoME NEW TESTS OF GOODNESS-OF-FIT 
From (4) and (6) we have 


W; = Cy t+... HCg_py + (n+2—J) Cy (j=1,...,%), (7) 


where Cy) <... < C+» is the ordered set of intervals. The goodness-of-fit tests proposed in 
this section are all tests of the hypothesis that w, ...,w,, are distributed as ordered U(0, 1) 
variables. Large numbers of such tests could be devised; here we mention only three which 
in addition to being exact have special features of interest. The first is included on account 
of its simplicity and because it requires only the widely available tables of the F-distribution. 
We call it the modified median test since it is based on the median of the w's. Consider the
distribution of $w_r$. The probability element required is the probability that in a sample of n
uniform variables $r-1$ are less than $w_r$, one is in the range $w_r$, $w_r + dw_r$ and $n-r$ are greater
than $w_r$, i.e.
\[ dP = \frac{n!}{(r-1)!\,(n-r)!}\, w_r^{\,r-1} (1 - w_r)^{n-r}\, dw_r. \tag{8} \]
Thus
\[ M_r = \frac{r\,(1 - w_r)}{(n+1-r)\,w_r} \]
has Fisher's variance-ratio distribution with $2(n+1-r)$, $2r$ degrees of freedom. The value of
$w_r$ for any particular r can be used as a test statistic, r being at our disposal. An obvious
choice is the median value, i.e. $r = \tfrac12(n+1)$ for n odd and $r = \tfrac12 n$ (or $\tfrac12 n + 1$) for n even. Since
on the alternative we would expect $w_r$ to be smaller than on the null hypothesis a one-sided
F-test is appropriate.
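
As an illustration only, the transformation (7) and the median statistic can be sketched in Python. The function names are hypothetical, and the form used for the statistic, r(1 - w_r)/{(n + 1 - r) w_r}, is the variance-ratio form implied by the Beta distribution (8) and the stated degrees of freedom; it is a sketch under those assumptions, not a transcription of the printed formula.

```python
def ordered_interval_transform(u):
    """Map a sample u_1..u_n in (0, 1) to w_1..w_n using equation (7)."""
    n = len(u)
    pts = [0.0] + sorted(u) + [1.0]
    # c_(1) <= ... <= c_(n+1): the n + 1 interval lengths in increasing order
    c = sorted(b - a for a, b in zip(pts, pts[1:]))
    w, running = [], 0.0
    for j in range(1, n + 1):
        # w_j = c_(1) + ... + c_(j-1) + (n + 2 - j) c_(j)
        w.append(running + (n + 2 - j) * c[j - 1])
        running += c[j - 1]
    return w


def modified_median_statistic(w, r):
    """Assumed form M_r = r (1 - w_r) / ((n + 1 - r) w_r); large values are significant."""
    n = len(w)
    return r * (1.0 - w[r - 1]) / ((n + 1 - r) * w[r - 1])
```

For a sample of 50 with r = 25 the value returned would be compared with the upper percentage points of F on 52 and 50 degrees of freedom.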

More important is the second of the three tests which we call the modified Kolmogorov test. 
This is obtained by considering the difference between the sample and population distribution
functions corresponding to $w_1, \dots, w_n$. The test statistic is
\[ K_m = \max_{j = 1, \dots, n} \left( \frac{j}{n} - w_j \right). \tag{9} \]
Since we expect $K_m$ to increase as we depart from the null hypothesis a one-sided test is
appropriate. Consequently the test procedure is to reject when $K_m$ is greater than the value
tabulated for a one-sided Kolmogorov test. A suitable table has been given by Miller (1956).
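
A minimal sketch of the one-sided statistic, assuming it takes the usual form max over j of (j/n - w_j) for w's already arranged in increasing order; the function name is hypothetical.

```python
def modified_kolmogorov_statistic(w):
    """K_m = max over j of (j/n - w_j) for ordered values w_1 <= ... <= w_n."""
    n = len(w)
    return max(j / n - wj for j, wj in enumerate(w, start=1))
```

The value is referred to one-sided Kolmogorov percentage points for the same n.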


The third test comes from the observation that $\prod_{j=1}^{n} u_{(j)} = \prod_{j=1}^{n} u_j$. Since $\prod_{j=1}^{n} w_j$ is
distributed like $\prod_{j=1}^{n} u_{(j)}$ it therefore has the same distribution as $\prod_{j=1}^{n} u_j$. Karl Pearson (1933)
pointed out that $\prod_{j=1}^{n} u_j$ is distributed like $\exp(-\tfrac12\chi^2)$, where $\chi^2$ has the $\chi^2$ distribution with
2n degrees of freedom, and hence can be used as the statistic for an exact test of goodness-of-fit.
Applying Pearson's suggestion to $\prod_{j=1}^{n} w_j$ we obtain a third exact test which we call the
modified probability product test. The test statistic is
\[ P_m = -2 \log \prod_{j=1}^{n} w_j, \tag{10} \]
which is tested as a $\chi^2$ variate with 2n degrees of freedom.

The power of the original probability product test was studied by E. S. Pearson (1939)
and in the light of his results it seems likely that the $P_m$ test will in theory have high power
against a wide class of alternatives. However, its use in practice in the form presented above
is not recommended since the value of the statistic $P_m$ seems to be too much affected by
rounding errors in $u_1, \dots, u_n$. It is possible that the test might be altered slightly so as to
avoid this difficulty but the best way of doing so is not obvious.
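
The probability product statistic is simple to compute, and a short sketch makes the sensitivity to small values visible: a single w close to zero contributes a very large term to the sum of logarithms. The function name is hypothetical.

```python
import math


def modified_probability_product_statistic(w):
    """P_m = -2 log(w_1 w_2 ... w_n), computed as a sum of logs for stability."""
    return -2.0 * sum(math.log(wj) for wj in w)
```

For instance, replacing a value of 0.001 by 0.002 (a rounding error in the third decimal place) changes the statistic by 2 log 2, or about 1.39.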

Some empirical results obtained by applying the first two tests to artificial data are given 
in the next section. 


4. SOME EMPIRICAL RESULTS 


The performance of the $M_r$ and $K_m$ tests proposed in §3 has been compared with the
performance of the $\chi^2$ and ordinary Kolmogorov tests on five samples of 50 observations
from each of the following three distributions,

\[ \text{Exponential:} \quad \exp\{-(x+1)\}\,dx \qquad (x > -1). \tag{11} \]
\[ \text{Laplace:} \quad 2^{-1/2} \exp(-2^{1/2}|x|)\,dx \qquad (-\infty < x < \infty). \tag{12} \]
\[ \text{Normal:} \quad (2\pi)^{-1/2} \exp(-\tfrac12 x^2)\,dx \qquad (-\infty < x < \infty). \tag{13} \]


Each distribution has mean zero and variance unity and the null hypothesis is that the true 
distribution is the N(0, 1) distribution defined by (13). Random deviates from these distri- 
butions have been tabulated by Quenouille (1959) and the five sets of three samples in this 
experiment are the values given on the first five pages of Quenouille’s tables. Deviates for 
distributions (11) and (12) were obtained by Quenouille by transformation of the corre- 
sponding deviates from distribution (13); consequently the samples from the three distribu- 
tions correspond in the sense of being composed of values with the same probability integral 
transforms. 
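
The idea of corresponding deviates can be sketched as follows: a normal deviate is mapped to its probability integral u, and u is then passed through the inverse distribution functions of (11) and (12). This is only an illustration of the principle; the function name is hypothetical and the procedure actually used for the published tables may differ in detail.

```python
import math
from statistics import NormalDist


def corresponding_deviates(z):
    """Return (exponential, Laplace) deviates sharing the probability integral of z."""
    u = NormalDist().cdf(z)
    # Exponential (11): F(x) = 1 - exp{-(x + 1)} for x > -1
    expo = -1.0 - math.log(1.0 - u)
    # Laplace (12): density 2^(-1/2) exp(-2^(1/2) |x|)
    if u < 0.5:
        lap = math.log(2.0 * u) / math.sqrt(2.0)
    else:
        lap = -math.log(2.0 * (1.0 - u)) / math.sqrt(2.0)
    return expo, lap
```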

The following four tests were applied to each sample of 50 observations: 

(a) The $\chi^2$ test, taking ten groups with an expected number of five in each group.

(b) The two-sided Kolmogorov test based on the statistic
\[ K = \max_{j = 1, \dots, n} |S(x_j) - F(x_j)|, \]
where S(x) is the sample distribution function and F(x) is the hypothetical distribution
function. Significance was assessed by referring to the tables of critical values of this statistic
published by Miller (1956).

(c) The $M_r$ test proposed in §3 taking r = 25.

(d) The modified Kolmogorov test based on the statistic $K_m$ defined by (9).
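
Tests (a) and (b) can be sketched for the N(0, 1) null hypothesis as follows. Two assumptions are made here: the ten groups are taken to be equiprobable under N(0, 1), which gives the expected number of five per group for n = 50, and the Kolmogorov statistic is evaluated on both sides of each jump of the sample distribution function, which is the usual way of obtaining the supremum of |S(x) - F(x)|. The function names are hypothetical.

```python
from statistics import NormalDist


def chi_square_ten_groups(x):
    """Chi-square statistic with ten groups, equiprobable under N(0, 1)."""
    n = len(x)
    counts = [0] * 10
    for xi in x:
        u = NormalDist().cdf(xi)
        counts[min(int(10 * u), 9)] += 1
    expected = n / 10
    return sum((c - expected) ** 2 / expected for c in counts)


def kolmogorov_statistic(x):
    """Two-sided statistic sup |S(x) - F(x)| with F the N(0, 1) distribution function."""
    n = len(x)
    best = 0.0
    for j, xj in enumerate(sorted(x), start=1):
        f = NormalDist().cdf(xj)
        # S jumps from (j - 1)/n to j/n at the j-th ordered observation
        best = max(best, j / n - f, f - (j - 1) / n)
    return best
```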

The results are given in Table 1. The entries are values of the appropriate test statistics. 
Significance at the 5 and 1% levels is denoted by single and double asterisks, respectively.










With due reservations owing to the small number of samples considered, the results
suggest the following tentative conclusions for the sample size and alternatives considered:
(i) The unmodified Kolmogorov test is less powerful than the $\chi^2$ test.
(ii) The modified median test is more powerful than the unmodified Kolmogorov test but
slightly less powerful than the $\chi^2$ test.
(iii) The modified Kolmogorov test is more powerful than the unmodified Kolmogorov
test, slightly more powerful than the modified median test and about as powerful as the $\chi^2$
test.


Table 1. Comparison of performances of new and existing tests on artificial samples

                                              Sample
Population    Test           1          2          3          4          5
Exponential   $\chi^2$     17.2*      24.0**     22.8**     38.0**     24.8**
              $K$           0.16       0.18       0.18       0.26**     0.21**
              $M_{25}$      2.20**     1.71*      2.48**     2.44**     1.61*
              $K_m$         0.23**     0.22**     0.25**     0.22**     0.18*
Laplace       $\chi^2$     26.8**     13.5       12.0       19.6*      18.4*
              $K$           0.19*      0.15       0.10       0.14       0.14
              $M_{25}$      2.38**     1.23       1.19       0.91       1.36
              $K_m$         0.25**     0.17*      0.08       0.17*      0.16
Normal        $\chi^2$      7.2       13.7        8.8        7.2        8.8
              $K$           0.17       0.10       0.15       0.14       0.10
              $M_{25}$      1.62*      1.24       0.93       0.70       1.21
              $K_m$         0.16       0.12       0.05       0.12       0.08

The critical values of the four statistics are:
              $\chi^2$      $K$        $M_{25}$     $K_m$
5%            16.92        0.188      1.609        0.170
1%            21.67        0.226      1.967        0.211


In making these comparisons with the $\chi^2$ test it should be remembered that the $\chi^2$ test is
not an exact test whereas the other three tests are exact. There is no suggestion that the $\chi^2$
test should be used as an absolute standard; it is merely a convenient yardstick which is used 
because of familiarity among statisticians. Moreover, I would like to make it clear that in 
presenting these results it has not been my intention to make any general statements 
regarding the relative powers of the various procedures; the intention has been merely to 
summarize the performance of the tests under the conditions of this experiment. 


5. APPLICATIONS TO TIME-SERIES AND ARRIVAL-TIME DISTRIBUTIONS 


A further application of the foregoing theory is to testing the hypothesis that a series of 
identically distributed normal variables have uniform spectral density, i.e. are serially 
independent. In the spectral approach to time-series the basic statistic is the periodogram 


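\[ p_j = \frac{1}{2\pi T} \left| \sum_{t=1}^{T} x_t \exp\{(2\pi i j t)/T\} \right|^2 \qquad
   \begin{cases} j = 0, 1, \dots, \tfrac12 T & \text{for } T \text{ even}, \\ j = 0, 1, \dots, \tfrac12 (T-1) & \text{for } T \text{ odd}. \end{cases} \tag{14} \]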







Let $m = \tfrac12 T - 1$ for T even and $m = \tfrac12(T-1)$ for T odd. On the null hypothesis that $x_1, \dots, x_T$
are independent $N(\mu, \sigma^2)$ variables the quantities $2\pi p_j/\sigma^2$ $(j = 1, \dots, m)$ are independent
exponential variables with density $\exp(-x)$. Putting $c_j = p_j / \sum_{i=1}^{m} p_i$ $(j = 1, \dots, m)$ we have
the well-known result that $c_1, \dots, c_m$ are distributed like the intervals between successive
ordered U(0, 1) variables, i.e. they have the distribution (2) with n + 1 = m. Transforming
to $c_{(1)}, \dots, c_{(m)}$, then to $g_1, \dots, g_{m-1}$ and finally to $w_1, \dots, w_{m-1}$ as in §2, we have reduced the
problem to testing the hypothesis that $w_1, \dots, w_{m-1}$ are ordered U(0, 1) variables. For this
purpose the modified Kolmogorov test recommended in §3 is appropriate.
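
The first step of this procedure, from the series to the normalized periodogram ordinates, might be sketched as follows. The function name is hypothetical, and the constant 1/(2πT) in the periodogram is an assumption about the normalization of (14); it cancels in the ratios c_j in any case.

```python
import cmath
import math


def periodogram_intervals(x):
    """Return c_1..c_m: periodogram ordinates at the Fourier frequencies, scaled to sum to 1."""
    T = len(x)
    m = T // 2 - 1 if T % 2 == 0 else (T - 1) // 2
    p = []
    for k in range(1, m + 1):
        s = sum(xt * cmath.exp(complex(0.0, 2 * math.pi * k * t / T))
                for t, xt in enumerate(x, start=1))
        p.append(abs(s) ** 2 / (2 * math.pi * T))
    total = sum(p)
    return [pk / total for pk in p]
```

The c's returned are then ordered and transformed in the same way as the intervals between ordered uniform variables.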

It might be asked what advantage this test is likely to have over existing exact tests such 
as the von Neumann test (1941) based on the statistic 


\[ \frac{\delta^2}{s^2} = \frac{T \sum_{t=2}^{T} (x_t - x_{t-1})^2}{(T-1) \sum_{t=1}^{T} (x_t - \bar{x})^2}, \]
Fisher's test (1929) based on the largest of the $c_j$'s, or indeed the ordinary Kolmogorov test
applied to the unordered c’s. The answer is similar to that given for the goodness-of-fit 
problem in §3, namely, that we expect it to have high power against a wider range of 
alternatives. 

The method applies also in a rather obvious way to testing the hypothesis that events are 
occurring in a Poisson process. Suppose that one observes a Poisson process for a fixed 
length of time T and that n events occur at times $t_1, t_2, \dots, t_n$ during the interval (0, T). Let
$u_{(j)} = t_j/T$ $(j = 1, \dots, n)$. Then $u_{(1)}, \dots, u_{(n)}$ are distributed in the form (1). Alternatively,
suppose one fixes n in advance and starting at time zero, records the times $t_1, \dots, t_{n+1}$ at
which n + 1 events occur. Let $u_{(j)} = t_j/t_{n+1}$ $(j = 1, \dots, n)$. Then $u_{(1)}, \dots, u_{(n)}$ are distributed in
the form (1). Simple proofs of these results are given by Epstein (1960, Appendix 1). In
either case one can transform to the c's, the g's and the w's as in §§ 2 and 3 and employ the
modified Kolmogorov test in the way described.
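
The two sampling schemes differ only in the divisor used to scale the event times, as the following sketch shows. The function name and the convention of passing `T=None` for the fixed-count scheme are illustrative choices, not part of the method.

```python
def scaled_event_times(times, T=None):
    """Scale event times to (0, 1).

    With T given, events were observed over the fixed interval (0, T).
    With T omitted, the last recorded time is taken as t_{n+1} and used as divisor.
    """
    ts = sorted(times)
    if T is None:
        T, ts = ts[-1], ts[:-1]
    return [t / T for t in ts]
```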


6. TESTS BASED ON SYSTEMATIC SUBSAMPLING OF THE OBSERVATIONS 


Most of the work required to carry out the goodness-of-fit tests described in §3 is likely 
to occur in making the probability integral transformation from the x's to the u's; indeed
when the sample is large the amount of labour required can be prohibitively great. A way of
reducing the work substantially is to confine the analysis to a systematic subsample of the
observations. Suppose that the sample size is now denoted by N. Instead of transforming all
the x's we first arrange them in order of magnitude, giving $x_{(1)} \le x_{(2)} \le \dots \le x_{(N)}$ say, and then
transform every kth, i.e. we calculate $u_{(k)} = F(x_{(k)})$, $u_{(2k)} = F(x_{(2k)})$, ..., $u_{(nk)} = F(x_{(nk)})$, where
k is a suitably chosen integer and n is the largest integer satisfying $nk + k - 1 \le N$.

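Suppose for simplicity that N = nk + k - 1; otherwise reject N - nk - k + 1 observations
at random. The distribution of $u_{(k)}, \dots, u_{(nk)}$ is (see, for example, Wilks, 1948)
\[ dP = \frac{(nk+k-1)!}{\{(k-1)!\}^{n+1}}\, u_{(k)}^{\,k-1} \prod_{j=1}^{n-1} (u_{(jk+k)} - u_{(jk)})^{k-1}\, (1 - u_{(nk)})^{k-1}\, du_{(k)} \dots du_{(nk)} \]
\[ (0 \le u_{(k)} \le u_{(2k)} \le \dots \le u_{(nk)} \le 1). \tag{15} \]
Let
\[ p_j = \frac{u_{(jk)}}{u_{(jk+k)}} \quad (j = 1, \dots, n-1), \qquad p_n = u_{(nk)}. \tag{16} \]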








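The Jacobian is
\[ \frac{\partial(u_{(k)}, \dots, u_{(nk)})}{\partial(p_1, \dots, p_n)} = \prod_{j=2}^{n} p_j^{\,j-1}. \]
The distribution of $p_1, \dots, p_n$ is, therefore,
\[ dP = \frac{(nk+k-1)!}{\{(k-1)!\}^{n+1}}\, p_1^{\,k-1}(1-p_1)^{k-1}\, p_2^{\,2k-1}(1-p_2)^{k-1} \cdots p_n^{\,nk-1}(1-p_n)^{k-1}\, dp_1 \dots dp_n \]
\[ (0 \le p_j \le 1, \text{ all } j). \tag{17} \]
Thus $p_1, \dots, p_n$ are independently distributed, the distribution of $p_j$ being of the Beta form
\[ dP = \{B(jk, k)\}^{-1}\, p_j^{\,jk-1} (1 - p_j)^{k-1}\, dp_j. \tag{18} \]
Let $z_j$ denote the probability integral transform of $p_j$, i.e.
\[ z_j = \{B(jk, k)\}^{-1} \int_0^{p_j} x^{jk-1} (1 - x)^{k-1}\, dx \qquad (j = 1, \dots, n). \tag{19} \]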


Then $z_1, \dots, z_n$ are independent U(0, 1) variables. The actual values of the z's determined by
(19) may be obtained from the Tables of the Incomplete Beta Function (K. Pearson, 1948).
Let
\[ q_j = z_j^{1/j} \qquad (j = 1, \dots, n). \tag{20} \]
The joint distribution of $q_1, \dots, q_n$ is seen to be
\[ dP = n!\, q_2\, q_3^{2} \cdots q_n^{\,n-1}\, dq_1 \dots dq_n \qquad (0 \le q_j \le 1, \text{ all } j). \tag{21} \]
This is of the form (17) with k = 1. Let
\[ v_j = \prod_{i=j}^{n} q_i \qquad (j = 1, \dots, n). \tag{22} \]
We observe that $q_j = v_j/v_{j+1}$ $(j = 1, \dots, n-1)$ and $q_n = v_n$. Consequently the distribution of $v_1, \dots, v_n$ is given
by putting k = 1 in (15), i.e.
\[ dP = n!\, dv_1 \dots dv_n \qquad (0 \le v_1 \le v_2 \le \dots \le v_n \le 1). \tag{23} \]
It follows that $v_1, \dots, v_n$ are distributed as ordered U(0, 1) variables.
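
The chain of transformations from the subsample to the v's can be sketched in a few lines. Because jk and k are integers, the incomplete Beta integral in (19) reduces to a binomial tail sum, which replaces the table look-up here. The function names are hypothetical and the sketch assumes the u's are supplied in increasing order.

```python
import math


def beta_cdf_int(x, a, b):
    """Beta(a, b) distribution function at x for positive integers a, b (binomial tail sum)."""
    m = a + b - 1
    return sum(math.comb(m, i) * x**i * (1 - x) ** (m - i) for i in range(a, m + 1))


def subsample_to_ordered_uniforms(u_sub, k):
    """Map u_(k), u_(2k), ..., u_(nk) to v_1 <= ... <= v_n via (16), (19), (20) and (22)."""
    n = len(u_sub)
    # (16): p_j = u_(jk) / u_(jk+k) for j < n, and p_n = u_(nk)
    p = [u_sub[j] / u_sub[j + 1] for j in range(n - 1)] + [u_sub[-1]]
    # (19): z_j is the Beta(jk, k) distribution function evaluated at p_j
    z = [beta_cdf_int(pj, (j + 1) * k, k) for j, pj in enumerate(p)]
    # (20): q_j = z_j^(1/j)
    q = [zj ** (1.0 / (j + 1)) for j, zj in enumerate(z)]
    # (22): v_j = q_j q_{j+1} ... q_n
    v, prod = [0.0] * n, 1.0
    for j in range(n - 1, -1, -1):
        prod *= q[j]
        v[j] = prod
    return v
```

With k = 1 the transformation reduces to the identity, which gives a convenient check on an implementation.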

We have transformed the systematic subsample obtained by taking every kth value into
a sample of ordered U(0, 1) variables and may therefore now apply to $v_1, \dots, v_n$ the methods
developed in §§ 2 and 3 for analysing $u_{(1)}, \dots, u_{(n)}$. Of course a substantial amount of computation
is required in order to proceed from $u_{(k)}, \dots, u_{(nk)}$ to $v_1, \dots, v_n$. Whether or not this is
worth while depends on how much computation is saved by analysing a systematic subsample
instead of the whole sample. The procedure clearly involves some loss of power but
I have not attempted to assess how great this is likely to be. Note that the same method can
be employed to construct an exact test for equality of variances; however, in view of the 
amount of computation required the idea does not seem worth elaborating upon in detail. 


7. THE ELIMINATION OF NUISANCE PARAMETERS BY THE METHOD 
OF RANDOM SUBSTITUTION 
A serious limitation to the practical usefulness of the Kolmogorov test has been that there 
has not been available a simple method of allowing for the presence of nuisance parameters. 
For example, it has not been possible to use it to obtain an exact test of normality in the 
usual situation where the population mean and variance are unknown. The reader is 




referred to the paper by Kac, Kiefer & Wolfowitz (1955) for a thorough discussion of the
difficulties involved. A similar limitation applies to the tests proposed in §3. Consequently,
if we are to bring these tests to the point of practical usefulness for the classical goodness-of-fit
problem we require a practicable method of dealing with nuisance parameters.

A simple randomization method for doing this will now be presented. Although the method
is of wide application it is convenient to introduce it by applying it to the specific problem of
testing for normality.

Suppose we wish to test the hypothesis that $x_1, \dots, x_n$ are independent observations from
a normal distribution with unknown mean $\mu$ and unknown variance $\sigma^2$. Let $\bar{x} = n^{-1}\sum x_i$ and
$s^2 = (n-1)^{-1}\sum (x_i - \bar{x})^2$; furthermore let $\bar{x}'$, $s'^2$ be observations of random variables independent
of $x_1, \dots, x_n$ and distributed as the sample mean and variance calculated from a
sample of n independent observations from a N(0, 1) distribution, i.e. $\bar{x}'$ has the distribution
\[ dP = \text{constant} \times \exp(-\tfrac12 n \bar{x}'^2)\, d\bar{x}' \tag{24} \]
and $s'^2$ has the distribution
\[ dP = \text{constant} \times (s'^2)^{\frac12(n-3)} \exp\{-\tfrac12 (n-1) s'^2\}\, ds'^2. \tag{25} \]
Define $x'_1, \dots, x'_n$ by the relations
\[ \frac{x'_i - \bar{x}'}{s'} = \frac{x_i - \bar{x}}{s} \qquad (i = 1, \dots, n). \tag{26} \]
We shall show that $x'_1, \dots, x'_n$ are independent N(0, 1) variables. Any of the exact tests
previously discussed can now be applied to $x'_1, \dots, x'_n$. We have, in fact, transformed the
composite hypothesis concerning $x_1, \dots, x_n$ into a simple hypothesis concerning $x'_1, \dots, x'_n$.
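The proof is as follows. For the distribution of $x_1, \dots, x_n$ we have
\[ dP = \text{constant} \times \exp\Big\{ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 \Big\}\, dx_1 \dots dx_n. \tag{27} \]
Let
\[ l_i = \frac{x_i - \bar{x}}{s} \qquad (i = 1, \dots, n), \tag{28} \]
\[ m_r = \Big( \frac{r-1}{r} \Big)^{1/2} \Big( l_r - \frac{1}{r-1} \sum_{i=1}^{r-1} l_i \Big) \qquad (r = 2, \dots, n); \tag{29} \]
\[ \begin{aligned}
m_2 &= (n-1)^{1/2} \cos a_1 \cos a_2 \cdots \cos a_{n-2}, \\
m_r &= (n-1)^{1/2} \cos a_1 \cos a_2 \cdots \cos a_{n-r} \sin a_{n-r+1} \qquad (r = 3, \dots, n-1), \\
m_n &= (n-1)^{1/2} \sin a_1.
\end{aligned} \tag{30} \]
(29) and (30) are the Helmert and polar transformations. Substituting in (27) we have the
classical representation of the probability element of a normal sample in terms of $\bar{x}$, $s^2$ and
the angular variables $a_1, \dots, a_{n-2}$, i.e.
\[ dP = \text{constant} \times \exp\Big\{ -\frac{n}{2\sigma^2} (\bar{x} - \mu)^2 \Big\}\, s^{n-3} \exp\{-\tfrac12 (n-1) s^2/\sigma^2\} \]
\[ \times \cos^{n-3} a_1 \cos^{n-4} a_2 \cdots \cos a_{n-3}\, d\bar{x}\, ds^2\, da_1 \dots da_{n-2} \tag{31} \]
(Geary, 1933; see also Kendall & Stuart, 1958, pp. 247, 250 for details of the derivation).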










Consequently $\bar{x}$, $s^2$ and the set of angles $a_1, \dots, a_{n-2}$ are independently distributed with
distributions
\[ dP = \text{constant} \times \exp\Big\{ -\frac{n}{2\sigma^2} (\bar{x} - \mu)^2 \Big\}\, d\bar{x}, \tag{32} \]
\[ dP = \text{constant} \times s^{n-3} \exp\{-\tfrac12 (n-1) s^2/\sigma^2\}\, ds^2, \tag{33} \]
\[ dP = \text{constant} \times \cos^{n-3} a_1 \cos^{n-4} a_2 \cdots \cos a_{n-3}\, da_1 \dots da_{n-2}, \tag{34} \]
respectively.
More important from the present point of view is the converse result which for the sake of
clarity we state in the form of the following lemma.

Lemma. Suppose $\bar{x}$, $s^2$, $a_1, \dots, a_{n-2}$ are independent random variables with distributions (32),
(33) and (34). Let $x_1, \dots, x_n$ be determined by (28), (29) and (30) together with the relation
$\sum_{i=1}^{n} l_i = 0$. Then $x_1, \dots, x_n$ are independent $N(\mu, \sigma^2)$ variables.


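To prove this, suppose that $s^2$ and $a_1, \dots, a_{n-2}$ are independent with distributions (33) and
(34), respectively. Let
\[ \begin{aligned}
y_2 &= (n-1)^{1/2} s \cos a_1 \cos a_2 \cdots \cos a_{n-2}, \\
y_r &= (n-1)^{1/2} s \cos a_1 \cos a_2 \cdots \cos a_{n-r} \sin a_{n-r+1} \qquad (r = 3, \dots, n-1), \\
y_n &= (n-1)^{1/2} s \sin a_1.
\end{aligned} \tag{35} \]
The Jacobian is
\[ \frac{\partial(y_2, \dots, y_n)}{\partial(s^2, a_1, \dots, a_{n-2})} = \tfrac12 (n-1)^{\frac12(n-1)} s^{n-3} \cos^{n-3} a_1 \cos^{n-4} a_2 \cdots \cos a_{n-3} \]
(Kendall & Stuart, 1958, p. 247). On substitution we find for the distribution of $y_2, \dots, y_n$
\[ dP = \text{constant} \times \exp\Big( -\frac{1}{2\sigma^2} \sum_{i=2}^{n} y_i^2 \Big)\, dy_2 \dots dy_n. \]
Suppose $\bar{x}$ independently has distribution (32). Let $x_1, \dots, x_n$ be determined by
\[ \sum_{i=1}^{n} x_i = n\bar{x}, \tag{36} \]
\[ \Big( \frac{r-1}{r} \Big)^{1/2} \Big( x_r - \frac{1}{r-1} \sum_{i=1}^{r-1} x_i \Big) = y_r \qquad (r = 2, \dots, n). \tag{37} \]
On substituting in the joint distribution of $\bar{x}$ and $y_2, \dots, y_n$ we find that $x_1, \dots, x_n$ have the
distribution (27). It remains to show that (35), (36) and (37) are equivalent to (28), (29), (30)
and the relation $\sum_{i=1}^{n} l_i = 0$; the equivalence follows on summing both sides of (28) and on
multiplying (30) through by s.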
Suppose now that $x_1, \dots, x_n$ have distribution (27) and that $a_1, \dots, a_{n-2}$ are determined by
(28), (29) and (30). Let $\bar{x}'$ and $s'^2$ have distributions (24) and (25) independently of $x_1, \dots, x_n$.
Suppose that $l_1, \dots, l_n$ are determined from $a_1, \dots, a_{n-2}$ by (29), (30) and the relation $\sum_{i=1}^{n} l_i = 0$.
Let $x'_1, \dots, x'_n$ be determined by
\[ l_i = \frac{x'_i - \bar{x}'}{s'} \qquad (i = 1, \dots, n). \tag{38} \]




By the lemma it follows that $x'_1, \dots, x'_n$ are independent N(0, 1) variables. We have therefore
established the claim that composite hypotheses concerning $x_1, \dots, x_n$ can be tested as simple
hypotheses concerning $x'_1, \dots, x'_n$.
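
In computational terms the random substitution amounts to standardizing the observations by their own mean and standard deviation and then restoring a randomly drawn mean and standard deviation. The following is a minimal sketch under that reading; the function name is hypothetical, and the random pair is generated from pseudo-random normal deviates rather than from published tables.

```python
import math
import random


def random_substitution(x, rng=random):
    """Replace the sample mean and s.d. of x by a random pair drawn as for N(0, 1) samples."""
    n = len(x)
    xbar = sum(x) / n
    s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / (n - 1))
    # xbar' is N(0, 1/n); (n - 1) s'^2 is chi-square with n - 1 degrees of freedom
    xbar_new = rng.gauss(0.0, 1.0) / math.sqrt(n)
    s_new = math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n - 1)) / (n - 1))
    return [xbar_new + s_new * (xi - xbar) / s for xi in x]
```

The transformed values can then be passed to any test of the simple hypothesis of N(0, 1) observations. Passing a seeded `random.Random` instance as `rng` makes a particular act of randomization reproducible.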

I want to put on record the fact that my original geometrical derivation of this result was 
much shorter than that presented here. However, a number of my friends were so sceptical 
about either the result itself or the argument supporting it that I felt compelled to supply 
full algebraical details, even though they might seem rather tedious (see also Prof. Pearson’s 
note on p. 55). 

The method is clearly of wide application. Suppose in general that $x_1, \dots, x_n$ have a
distribution with distribution function $F(x_1, \dots, x_n, \theta)$ depending on a set of parameters $\theta$ for
which a sufficient set of statistics $t_1$ is available. Suppose that a transformation T independent
of $\theta$ carrying $x_1, \dots, x_n$ into $(t_1, t_2)$ exists, $t_2$ being another set of statistics distributed independently
of $t_1$, such that the inverse transformation $T^{-1}$ carries $(t_1, t_2)$ into $x_1, \dots, x_n$
uniquely. Let $G(t_1, \theta)$ denote the distribution function of $t_1$ and let $t'_1$ be an observation of a
random vector independent of $x_1, \dots, x_n$ with distribution function $G(t'_1, \theta_0)$ where $\theta_0$ is a
known value of $\theta$. Let $x'_1, \dots, x'_n$ be the values obtained by applying $T^{-1}$ to $(t'_1, t_2)$. Then
$x'_1, \dots, x'_n$ are distributed with distribution function $F(x'_1, \dots, x'_n, \theta_0)$. As for the lemma, the
proof follows by merely reversing the transformation.

To illustrate the use of this general result consider the construction of an exact distribution-free
test of serial correlation. The null hypothesis is that $x_1, \dots, x_n$ are independently and
identically distributed with an unknown continuous distribution function. Let $x_{i_1}, x_{i_2}, \dots, x_{i_n}$
be the set of x's arranged in increasing order of magnitude; $i_1, i_2, \dots, i_n$ is then a permutation
of the suffixes 1, 2, ..., n. Let $x'_{i_1}, x'_{i_2}, \dots, x'_{i_n}$ be a sample of n independent N(0, 1) variables
taken from a table of random normal deviates and arranged in increasing order of magnitude.
The arrangement of the suffixes is taken to be the same as on $x_{i_1}, x_{i_2}, \dots, x_{i_n}$. Calculate von
Neumann's statistic
\[ \frac{\delta^2}{s^2} = \frac{n \sum_{t=2}^{n} (x'_t - x'_{t-1})^2}{(n-1) \sum_{t=1}^{n} (x'_t - \bar{x}')^2}, \]
where $\bar{x}' = \frac{1}{n} \sum_{t=1}^{n} x'_t$, and refer to tables of critical values of this statistic (Hart, 1942).
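
The construction can be sketched as follows: ordered normal deviates are placed in the time positions occupied by the observations of the same rank, and the von Neumann ratio is computed from the substituted series. The function names are hypothetical, pseudo-random deviates stand in for a table of random normal deviates, and ties among the x's are assumed absent, as they are with probability one for a continuous distribution.

```python
import random


def von_neumann_ratio(x):
    """Ratio of the mean-square successive difference to the variance."""
    n = len(x)
    xbar = sum(x) / n
    num = n * sum((x[t] - x[t - 1]) ** 2 for t in range(1, n))
    den = (n - 1) * sum((xt - xbar) ** 2 for xt in x)
    return num / den


def rank_substituted_series(x, rng=random):
    """Replace each x_t by the N(0, 1) deviate having the same rank as x_t."""
    n = len(x)
    z = sorted(rng.gauss(0.0, 1.0) for _ in range(n))
    order = sorted(range(n), key=lambda t: x[t])  # the suffixes i_1, ..., i_n
    out = [0.0] * n
    for rank, t in enumerate(order):
        out[t] = z[rank]
    return out
```

The statistic `von_neumann_ratio(rank_substituted_series(x))` would then be referred to the tabulated percentage points.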

The key to the construction of this test is the observation that the order statistic
$x_{i_1}, x_{i_2}, \dots, x_{i_n}$ is a sufficient statistic for the unknown distribution function which, when the
null hypothesis is true, is distributed independently of the permutation $i_1, \dots, i_n$ of the
suffixes 1, 2, ..., n (see Fraser, 1957). Other possible applications which spring to mind are
the testing of bivariate and multivariate normality and the testing of serial correlation in 
least-squares regression. 

The price that has to be paid for the elimination of nuisance parameters by this method is 
that an element of randomization is introduced in the analysis of the data. The device can be 
objected to on the ground that it permits different investigators to draw different conclusions 
from the same set of data. The reader is referred to the paper by E. S. Pearson (1950) for a
searching discussion of the questions this raises. Some statisticians object to the use of 
randomization at any stage, design or analysis, as a matter of principle; others feel that 
randomization is legitimate in the design of an experiment but not in the analysis of the 






















results. I do not wish to take up space on this occasion by discussing my personal attitude to 
randomization. I do, however, wish to point out an important operational distinction between
randomization in design and randomization in analysis. In design an act of randomization 
is performed once and for all, the experimental layout being arranged according to what turns 
up in the randomization procedure. Since the experimental results achieved depend on the 
result of randomization, it is hard for the investigator to see how randomization has affected 
his conclusions without repeating the entire experiment. And even if he were to do this, 
any differences which emerged could just as well be due to changes in other conditions of 
the experiment as to a change in the result of the randomization procedure. 


Table 2. Some results for random-substitution tests of normality

                                     Random pair $\bar{x}'$, $s'^2$
Population    Test           1          2          3          4          5
Exponential   $\chi^2$     21.2*      26.0**     21.6*      34.4**     22.8**
              $K$           0.160      0.227**    0.271**    0.288**    0.181
              $M_{25}$      2.44**     2.22**     2.23**     2.61**     2.43**
              $K_m$         0.263**    0.274**    0.278**    0.295**    0.294**
Laplace       $\chi^2$     12.0       25.2**     14.8       22.0**     22.4**
              $K$           0.163      0.190*     0.191*     0.256**    0.156
              $M_{25}$      2.42**     2.66**     2.12**     2.72**     2.72**
              $K_m$         0.222**    0.242**    0.209*     0.240*     0.239**
Normal        $\chi^2$      8.4        6.0       13.6       11.6        8.8
              $K$           0.126      0.138      0.190      0.100      0.120
              $M_{25}$      1.75*      1.45       1.76*      1.45       1.39
              $K_m$         0.129      0.136      0.147      0.143      0.143


At the analysis stage, however, the position is different. The experimental results are a 
fixed set of numbers and the investigator can repeat the randomization procedure as often 
as he likes without affecting them. Consequently if he wishes to know the extent to which his 
conclusions are affected by the result of randomization he merely has to repeat the analysis 
after a fresh act of randomization. This can be done as often as the investigator wishes. Of 
course, in order to preserve the exact probabilities of his test he must abide by the results 
indicated by his first act of randomization. 

As an illustration, the first sample of 50 for each of the three distributions considered in
§4 was tested for normality assuming the population mean and variance to be unknown.
The sample means and variances were replaced by values $\bar{x}'$, $s'^2$ picked at random. For $\bar{x}'$ the
quantity $1/\sqrt{50}$ times a random N(0, 1) deviate was used and for $s'^2$ the quantity
\[ \frac{1}{49} \Big( 2 \sum_{i=1}^{24} e_i + z^2 + 48 \Big) \]
was used, where $e_1, \dots, e_{24}$ are random exponential deviates with distribution (11) while z is a
random N(0, 1) deviate; all these random deviates were taken from Quenouille's tables. The
hypothesis that the values $x'_1, \dots, x'_{50}$ determined by (38) were independent N(0, 1) variables
was tested by each of the four tests considered in §4. In order to explore the effect of
randomization the tests were repeated five times using a new pair of randomly chosen
values $\bar{x}'$, $s'^2$ on each occasion. The results are given in Table 2. Entries in the table are values
of the appropriate test statistics. Significance at the 5 and 1% levels is denoted by
single and double asterisks, respectively.










It has to be admitted that the amount of variation from random pair to random pair is 
disappointing for both the $\chi^2$ test and the ordinary Kolmogorov test. The reason is presumably
the strong dependence of the test statistics on the sample mean and variance. On
the other hand the amount of variation for the modified median and modified Kolmogorov 
tests is substantially less, indicating that the effect of the re-ordering of the intervals has 
been to reduce the dependence of the test statistics on the randomly substituted values. 


I am indebted to D. E. Barton, M. G. Kendall and A. Stuart for reading the manuscript 
and for some helpful comments.


REFERENCES 


Barnard, G. A. (1953). Time intervals between accidents – a note on Maguire, Pearson and Wynn's
paper. Biometrika, 40, 212-3.

Bartholomew, D. J. (1956). Tests for randomness in a series of events when the alternative is a
trend. J. R. Statist. Soc. B, 18, 234-239.

Bartholomew, D. J. (1957). Testing for departure from the exponential distribution. Biometrika,
44, 253-6.

Barton, D. E. & David, F. N. (1955). Sums of ordered intervals and distances. Mathematika, 2,
150-9.

Barton, D. E. & David, F. N. (1956a). Tests for randomness of points on a line. Biometrika, 43,
104-12.

Barton, D. E. & David, F. N. (1956b). Some notes on ordered random intervals. J. R. Statist. Soc. B,
18, 79-94.

Cox, D. R. (1955). Some statistical methods connected with series of events. J. R. Statist. Soc. B,
17, 129-57.

Darling, D. A. (1953). On a class of problems relating to the random division of an interval. Ann.
Math. Statist. 24, 239-53.

Darling, D. A. (1957). The Kolmogorov-Smirnov, Cramér-von Mises tests. Ann. Math. Statist.
28, 823-38.

Dwass, M. (1959). The distribution of linear combinations of random divisions of an interval. O.N.R.
Research Memorandum No. 21 (unpublished). Systems Research Group, Northwestern University.

Epstein, B. (1960). Tests for the validity of the assumption that the underlying distribution of life
is exponential, Part I. Technometrics, 2, 83-101.

Epstein, B. & Sobel, M. (1954). Some theorems relevant to life-testing from an exponential distribution.
J. Amer. Statist. Ass. 25, 373-81.

Epstein, B. & Tsao, C. K. (1953). Some two-sample tests based on ordered observations from the
exponential distribution. Ann. Math. Statist. 24, 458-66.

Fisher, R. A. (1929). Tests of significance in harmonic analysis. Proc. Roy. Soc. A, 125, 54-9.

Fraser, D. A. S. (1957). Non-parametric Methods in Statistics. New York: Wiley.

Geary, R. C. (1933). A general expression for the moments of certain symmetrical functions of
normal samples. Biometrika, 25, 184-6.

Greenwood, M. (1946). The statistical study of infectious diseases. J. R. Statist. Soc. A, 109, 85-103.

Hart, B. I. (1942). Significance levels for the ratio of the mean-square successive difference to the
variance. Ann. Math. Statist. 13, 445-7.

Irwin, J. O. (1955). A unified derivation of some well-known frequency distributions of interest in
biometry and statistics. J. R. Statist. Soc. A, 118, 389-98.

Kac, M., Kiefer, J. & Wolfowitz, J. (1955). On tests of normality and other tests of goodness of
fit based on distance methods. Ann. Math. Statist. 26, 189-211.

Kendall, M. G. & Stuart, A. (1958). The Advanced Theory of Statistics, vol. 1. London: Griffin.

Maguire, B. A., Pearson, E. S. & Wynn, A. H. A. (1952). The time intervals between industrial
accidents. Biometrika, 39, 168-80.

Maguire, B. A., Pearson, E. S. & Wynn, A. H. A. (1953). Further notes on the analysis of accident
data. Biometrika, 40, 213-16.

Malmquist, S. (1951). On a property of order statistics from a rectangular population. Skand.
Aktuar. Tidskr. 33, 214-22.

Mauldon, J. G. (1951). Random division of an interval. Proc. Camb. Phil. Soc. 47, 331-6.










Miller, L. H. (1956). Tables of percentage points of Kolmogorov statistics. J. Amer. Statist. Ass.
51, 111-21.

Moran, P. A. P. (1947). The random division of an interval, I. J. R. Statist. Soc. B, 9, 92-8.

Moran, P. A. P. (1951). The random division of an interval, II. J. R. Statist. Soc. B, 13, 147-50.

Neyman, J. (1937). 'Smooth test' for goodness of fit. Skand. Aktuar. Tidskr. 20, 149-99.

Pearson, K. (1948). (Ed.) Tables of the Incomplete Beta Function. London: Biometrika Office.

Pearson, K. (1933). On a method of determining whether a sample of size n supposed to have been
drawn from a parent population having a known probability integral has probably been drawn at
random. Biometrika, 25, 379-410.

Pearson, E. S. (1939). The probability integral transformation for testing goodness-of-fit and combining
independent tests of significance. Biometrika, 30, 134-48.

Pearson, E. S. (1950). On questions raised by the combination of tests based on discontinuous distributions.
Biometrika, 37, 383-98.

Quenouille, M. H. (1959). Tables of random observations from standard distributions. Biometrika,
46, 178-202.

Rényi, A. (1953). On the theory of order statistics. Acta math. hung. 4, 191-231.

Sherman, B. (1950). A random variable related to the spacing of sample values. Ann. Math. Statist.
21, 339-61.

Sukhatme, P. V. (1936). On the analysis of k samples from exponential populations with special
reference to problems of random intervals. Statist. Res. Mem. 1, 94-112.

Sukhatme, P. V. (1937). Tests of significance for samples of the $\chi^2$ population with two degrees of
freedom. Ann. Eugen., Lond., 8, 52-6.

Whitworth, W. A. (1887). Choice and Chance, 3rd ed. Cambridge University Press.

Wilks, S. S. (1948). Order statistics. Bull. Amer. Math. Soc. 54, 6-50.

van Dantzig, D. (1954). Mathematical problems raised by the flood disaster 1953. Proceedings of
the International Congress of Mathematicians, 1954, vol. 1, 218-39. Amsterdam.

von Neumann, J. (1941). Distribution of the ratio of the mean-square successive difference to the
variance. Ann. Math. Statist. 12, 367-95.


EDITORIAL NOTE
If we write $u_i = (x_i - \bar{x})/s$, Mr Durbin's equation (26) may be written
\[ x'_i = \bar{x}' + s' u_i, \tag{1} \]
where $\bar{x}'$, $s'$ and $u_i$ are three independent random variables, the first distributed normally with variance
1/n and the second as $\chi_{n-1}/\sqrt{(n-1)}$. The n values of $u_i$ are of course not independent, but if $x_i$ is a random
member of the original sample it is known (W. R. Thompson, 1935, Ann. Math. Statist. 6, 214 and
E. S. Pearson & C. Chandra Sekar, 1936, Biometrika, 28, 308) that the p.d.f. of $u_i$ is
\[ p(u_i) = \text{constant} \times \Big\{ 1 - \frac{n u_i^2}{(n-1)^2} \Big\}^{\frac12(n-4)}, \qquad -\frac{n-1}{\sqrt{n}} \le u_i \le \frac{n-1}{\sqrt{n}}. \tag{2} \]

At first sight it might appear unlikely that the product, $s'u_i$, of independent variables following,
respectively, a $\chi$ and a symmetrical $\beta$ distribution would be normally distributed, as they must be if
$x'_i$ is to be normally distributed. However, on performing the necessary transformation and integration
it is found that $s'u_i$ is distributed normally about zero with variance (n - 1)/n. It follows at once that
$x'_i$ is an N(0, 1) variable.

Mr Durbin's proof in terms of random angles is clearly the more fundamental, but he has suggested
that I should add this note since several readers of his paper in typescript had been puzzled, as I was,
about the distribution of $s'u_i$. E.S.P.



















Biometrika (1961), 48, 1 and 2, p. 57 57 
Printed in Great Britain 


Preemptive priority queueing 


By C. R. HEATHCOTE 
Australian National University 


SUMMARY 


The basic queueing model considered is one in which certain customers are given a pre- 
emptive right to service over routine, non-priority, customers. Servicing of the latter is thus 
liable to interruption by the arrival of a priority customer. Given a model of this kind we 
require the distribution of queue length for the non-priority class of customers. The service 
time distribution of the non-priority customers is assumed to follow $\chi^2$ with an even number
of degrees of freedom. This includes the negative exponential and constant service time 
distributions as special cases. The method of generating functions is used to obtain a solution 
in the equilibrium state and, as a Laplace transform, for finite time. Breakdowns in the 
service mechanism are interpreted as a modification of the preemptive discipline. 


1. INTRODUCTION 


A problem of interest in the practical application of the theory of queues is that of the 
effect on queue behaviour of interruptions to the servicing of customers. One such case is 
when breakdowns in the service mechanism add to the usual delay. Alternatively the inter- 
ruptions may be caused by a queue discipline which assigns priority to a certain group of 
customers. For example, in a communication system a priority customer could be an urgent 
message to be transmitted immediately on arrival irrespective of the state of the queue of 
routine (non-priority) messages. Under this discipline the arrival of a priority customer and 
a breakdown in the service mechanism are equivalent from the point of view of the customer 
whose service has been interrupted. A model combining both priority discipline and break- 
downs can be postulated, in which case a breakdown is formally interpreted as a priority 
customer with precedence over all others. We are interested in finding to what extent inter- 
ruptions such as these affect the distribution of queue length. 

Queueing models in which service interruptions are a distinguishing feature are termed 
preemptive priority systems in distinction to the priority discipline in which a priority 
customer proceeds to the head of the waiting line on arrival, but waits until the service of 
the current customer has ended. The preemptive priority model may be described as follows. 
A service facility caters for a population of customers divided into R priority classes, 
R = 2,3,.... (The case R = 1 corresponds to no priorities.) Label the classes serially in order 
of precedence 1, 2, ..., R. Then on arrival a customer of class i commences service im-
mediately provided no members of classes 1, 2, ..., i are present. If customers belonging to
any of these classes are present then the new arrival joins the queue of members of the same 
priority class. The servicing of customers of class i does not commence until the system is 
empty of customers of classes 1, 2,...,i—1. Further, a rule must be given governing the 
manner in which the service of an interrupted customer is to be resumed when the inter- 
ruption ceases. These are the assumptions specific to preemptive queueing. We also assume 
that the queue discipline within each class is ‘first come, first served’, and that all inter- 
arrival and service times are independently distributed. The inclusion of breakdowns in this
description implies that times between their occurrence and repair times are equivalent to
inter-arrival and service times, respectively. 

Preemptive queueing was apparently first considered by Barry (1956), and in more detail 
by White & Christie (1958) and Stephan (1958). These authors discussed the steady-state 
model for a single server negative exponential queue with two priority classes. Heathcote 
(1959) obtained the joint generating function of the temporal probabilities for this queue. 
Subsequently these results were generalized to the case of R priority classes (Heathcote, 
1960), but the assumption of negative exponential inter-arrival and service times was 
retained. In this paper the preemptive priority problem treated is that for a single server 
with two priority classes where the service time of the non-priority customers follows an 
Erlang distribution ($\chi^2$ with an even number of degrees of freedom). A particular case
considered is that when this service time is of constant duration. We assume that the inter- 
arrival times and the service time of priority customers follow negative exponential distribu- 
tions. Models with more general service and inter-arrival distributions can be solved by the 
methods used here, at least in principle, but the above model is probably general enough 
for most practical purposes. 

The question of the rule under which a customer recommences service after interruption 
has been discussed by White & Christie (1958). They draw a distinction between ‘preemp- 
tive resume’ and ‘preemptive repeat’. By the former is meant that service recommences
at the point it had reached when the interruption occurred. In contrast the ‘preemptive 
repeat’ rule means that service commences from the beginning every time the customer 
re-enters service. Provided the service distributions are negative exponential there is no 
difference mathematically between these two rules because of the special properties of the 
negative exponential distribution. Since we are concerned with a distribution which does 
not in general possess these special properties, we will assume that the ‘preemptive 
resume’ rule holds. The equations governing the ‘preemptive repeat’ rule are rather more 
complicated. 

The preemptive priority model may be considered mathematically as a multidimensional 
random walk of a restricted nature. For the two-dimensional case discussed here the state 
of the system at time t is described by the vector (r, n); r, n = 0, 1, 2, ..., where r and n
denote respectively the number of priority and non-priority class members present in the 
system. The co-ordinates vary discretely as customers arrive or depart, and the random walk 
takes place on the lattice points of the positive quadrant in the (r,n)-plane. The restricted 
nature of the random walk is due to the fact that if r > 0, n can only increase since the service
mechanism has been preempted by the priority class. It is this special feature of the problem 
that permits a solution to be found by the straightforward use of generating functions. 
Further restrictions natural to the process have to be imposed. For the queueing problem 
itself, reflecting barriers are imposed along the r and n axes. If first passage times were of 
interest, such as the duration of busy periods, then these barriers would have to be modified 
to include an absorbing barrier at the point of interest. The breakdown model of §3 is an 
example where, in addition, other boundary conditions are required. 


2. PREEMPTIVE PRIORITY MODEL 


It is well known that the Erlang distribution may be considered as the sum of a number, 
K, say, of negative exponential phases, and many problems involving this distribution are 
solved in terms of phases, not customers, although it is the latter that are of major interest.
The equations governing transitions between phases are much simpler than those for num-
bers of customers, but because of its interest, we solve the problem in terms of the latter. 
Let the service times of the priority and non-priority customers have probability
densities

$$dH_1(t) = e^{-\mu_1 t}\,\mu_1\,dt,$$

$$dH_2(t) = e^{-\mu_2 t}\,\mu_2\,\frac{(\mu_2 t)^{K-1}}{(K-1)!}\,dt,$$


respectively. To write down the differential-difference equations governing the system the 
service phase of the current customer must be specified and we therefore are concerned with 
three discrete variables. Let $P_{rmn}(t)$ be the probability of r priority and n non-priority custo-
mers at time t, the customer in service being in phase m (m = K, K − 1, ..., 2, 1). The phases
are numbered in reverse order, a customer passing first through phase K, then phase K − 1,
and finally phase 1 before leaving the system. The suffix m appears only when n > 0 and
$P_{r0}(t)$ denotes the probability of r priority and zero non-priority customers. We will assume
that the system is empty at time zero, i.e.

$$P_{rmn}(0) = 0; \qquad P_{r0}(0) = \delta_{r0}. \tag{2-1}$$


The method carries over directly for more general initial conditions. 

The ‘forward’ differential-difference equations for $P_{rmn}(t)$ can be written down by con-
sidering all possible transitions from the point (r, m, n) in the time interval (t, t + dt). In
fact Morse (1958, p. 72) has given the difference equations satisfied by the equilibrium
probabilities for the Poisson input, Erlang service time queue and this system can be easily
generalized to incorporate the behaviour of the priority queue. Introducing Laplace trans-
forms

$$g_{rmn}(s) = \mathcal{L}[P_{rmn}(t)] = \int_0^{\infty} e^{-st} P_{rmn}(t)\,dt$$

and noting that

$$\mathcal{L}\left[\frac{d}{dt}P_{rmn}(t)\right] = s\,g_{rmn}(s) - P_{rmn}(0),$$

it is apparent that the pure difference equations satisfied by $g_{rmn}(s)$ are almost identical
with those satisfied by the equilibrium probabilities. The only equation substantially
changed is that for the initial state. Writing $\alpha = s + \lambda_1 + \lambda_2$, the equations for $g_{rmn}(s)$ with
initial conditions (2-1) are


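$$
\begin{aligned}
-\alpha g_{00} + \mu_1 g_{10} + \mu_2 g_{011} &= -1 &&(r = 0,\ n = 0), &&(2\text{-}2)\\
-(\alpha + \mu_2) g_{0m1} + \mu_1 g_{1m1} + \mu_2 g_{0\,m+1\,1} &= 0 &&(r = 0,\ n = 1,\ 1 \le m < K), &&(2\text{-}3)\\
-(\alpha + \mu_2) g_{0K1} + \lambda_2 g_{00} + \mu_1 g_{1K1} + \mu_2 g_{012} &= 0 &&(r = 0,\ n = 1,\ m = K), &&(2\text{-}4)\\
-(\alpha + \mu_2) g_{0mn} + \lambda_2 g_{0m\,n-1} + \mu_1 g_{1mn} + \mu_2 g_{0\,m+1\,n} &= 0 &&(r = 0,\ n > 1,\ 1 \le m < K), &&(2\text{-}5)\\
-(\alpha + \mu_2) g_{0Kn} + \lambda_2 g_{0K\,n-1} + \mu_1 g_{1Kn} + \mu_2 g_{01\,n+1} &= 0 &&(r = 0,\ n > 1,\ m = K), &&(2\text{-}6)\\
-(\alpha + \mu_1) g_{r0} + \lambda_1 g_{r-1\,0} + \mu_1 g_{r+1\,0} &= 0 &&(r > 0,\ n = 0), &&(2\text{-}7)\\
-(\alpha + \mu_1) g_{rm1} + \lambda_1 g_{r-1\,m1} + \mu_1 g_{r+1\,m1} &= 0 &&(r > 0,\ n = 1,\ 1 \le m < K), &&(2\text{-}8)\\
-(\alpha + \mu_1) g_{rK1} + \lambda_1 g_{r-1\,K1} + \lambda_2 g_{r0} + \mu_1 g_{r+1\,K1} &= 0 &&(r > 0,\ n = 1,\ m = K), &&(2\text{-}9)\\
-(\alpha + \mu_1) g_{rmn} + \lambda_1 g_{r-1\,mn} + \lambda_2 g_{rm\,n-1} + \mu_1 g_{r+1\,mn} &= 0 &&(r > 0,\ n > 1,\ 1 \le m \le K). &&(2\text{-}10)
\end{aligned}
$$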
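
The transitions underlying these equations may be made explicit in a short sketch. It assumes the preemptive resume rule and the phase numbering of this section. The function name `transitions` and the convention of writing the phase as `None` when no non-priority customer is present are illustrative choices, not part of the original formulation.

```python
def transitions(state, lam1, lam2, mu1, mu2, K):
    """Return {next_state: rate} for a state (r, m, n); m is None when n == 0."""
    r, m, n = state
    out = {}
    # A priority arrival is always possible and leaves the interrupted phase unchanged.
    out[(r + 1, m, n)] = lam1
    # A non-priority arrival either starts the queue in phase K or joins it.
    if n == 0:
        out[(r, K, 1)] = lam2
    else:
        out[(r, m, n + 1)] = lam2
    if r > 0:
        # The server is preempted: only priority customers can complete service.
        out[(r - 1, m, n)] = mu1
    elif n > 0:
        if m > 1:
            out[(r, m - 1, n)] = mu2      # one phase completed
        elif n > 1:
            out[(r, K, n - 1)] = mu2      # departure; next customer starts in phase K
        else:
            out[(r, None, 0)] = mu2       # departure leaves no non-priority customers
    return out

# Total outflow from (0, 1, 2) is lam1 + lam2 + mu2, the coefficient alpha - s + mu2 in (2-5).
print(transitions((0, 1, 2), 0.3, 0.4, 1.0, 3.0, 3))
```

Summing the rates out of a state reproduces the diagonal coefficient of the corresponding equation, with s removed, which gives a quick consistency check on the system as written.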







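We seek the joint generating function

$$F(s; z, y, x) = \sum_{r=0}^{\infty} z^r g_{r0}(s) + \sum_{r=0}^{\infty} z^r H_r(s; y, x), \tag{2-11}$$

where

$$H_r(s; y, x) = \sum_{m=1}^{K}\sum_{n=1}^{\infty} y^m x^n g_{rmn}(s).$$

The first sum on the right-hand side of (2-11) can be evaluated immediately. From (2-7)

$$g_{r0}(s) = g_{00}(s)\,v_1^r(s), \tag{2-12}$$

where

$$v_1(s) = (2\mu_1)^{-1}\{\alpha + \mu_1 - \sqrt{[(\alpha + \mu_1)^2 - 4\lambda_1\mu_1]}\}.$$

Then

$$\sum_{r=0}^{\infty} z^r g_{r0}(s) = g_{00}(s)\{1 - zv_1(s)\}^{-1}. \tag{2-13}$$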


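Multiplying (2-3)-(2-6), (2-8)-(2-10) by the appropriate power $y^m x^n$, summing, and using
(2-12), yields for $H_r(s; y, x)$

$$\mu_1 H_{r+1} - (\alpha + \mu_1 - \lambda_2 x)H_r + \lambda_1 H_{r-1} = -\lambda_2 x y^K v_1^r g_{00} \qquad (r > 0), \tag{2-14}$$

$$\mu_1 H_1 - (\alpha + \mu_2 - \lambda_2 x - \mu_2 y^{-1})H_0 = \mu_2(1 - y^K x^{-1})\sum_{n=1}^{\infty} x^n g_{01n} + y^K\{(\alpha - \lambda_2 x - \mu_1 v_1)g_{00} - 1\} \qquad (r = 0). \tag{2-15}$$

The general solution of (2-14) is

$$H_r = Aw_1^r + Bw_2^r - y^K v_1^r g_{00},$$

where A, B are constants and $w_1$, $w_2$ are the roots of the characteristic equation, namely

$$
\begin{aligned}
w_1(s, x) &= (2\mu_1)^{-1}\{\alpha + \mu_1 - \lambda_2 x - \sqrt{[(\alpha + \mu_1 - \lambda_2 x)^2 - 4\lambda_1\mu_1]}\},\\
w_2(s, x) &= (2\mu_1)^{-1}\{\alpha + \mu_1 - \lambda_2 x + \sqrt{[(\alpha + \mu_1 - \lambda_2 x)^2 - 4\lambda_1\mu_1]}\}.
\end{aligned} \tag{2-16}
$$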
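
A numerical sketch of the two roots may help in checking later formulae. It is restricted, as an assumption, to real s ≥ 0 and 0 ≤ x ≤ 1, where the discriminant is non-negative; the function name `roots_w` is hypothetical.

```python
import math

def roots_w(s, x, lam1, lam2, mu1):
    """Roots w1 <= w2 of mu1*w^2 - (alpha + mu1 - lam2*x)*w + lam1 = 0,
    with alpha = s + lam1 + lam2 (real s >= 0 and 0 <= x <= 1 assumed)."""
    b = s + lam1 + lam2 + mu1 - lam2 * x
    disc = math.sqrt(b * b - 4.0 * lam1 * mu1)
    return (b - disc) / (2.0 * mu1), (b + disc) / (2.0 * mu1)

w1, w2 = roots_w(0.5, 0.4, 0.2, 0.3, 1.0)
print(w1, w2, w1 * w2)   # the product equals lam1 / mu1
```

For s > 0 the characteristic polynomial is negative at w = 1, so that $w_1 < 1 < w_2$. This is consistent with discarding the term in $w_2^r$ when the generating function is required to converge.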


Since there is only one boundary condition, (2-15), the constant B may be set identically
zero. Then the value of A appropriate to (2-15) is

$$A(s, y, x) = \frac{\mu_2(1 - y^K x^{-1})\sum_{n=1}^{\infty} x^n g_{01n} - y^K\{(1 - y^{-1})\mu_2 g_{00} + 1\}}{\mu_1(1 - w_2) - \mu_2(1 - y^{-1})}.$$

The unknown sum $\sum_{n=1}^{\infty} x^n g_{01n}$, a function of s and x only, can be found by substituting

$$y = \{1 + (w_2 - 1)\mu_1\mu_2^{-1}\}^{-1}$$

in the numerator of this expression and equating it to zero;

$$\sum_{n=1}^{\infty} x^n g_{01n} = x\mu_2^{-1}\{\mu_1(w_2 - 1)g_{00} - 1\}\{1 - x[1 + (w_2 - 1)\mu_1\mu_2^{-1}]^K\}^{-1}.$$

Writing

$$\beta(s, x) = \{w_2(s, x) - 1\}\mu_1\mu_2^{-1},$$

the required generating function is

$$F(s; z, y, x) = \frac{(1 - y^K)g_{00}}{1 - zv_1} + \frac{xy\{y^K(1 + \beta)^K - 1\} + g_{00}[\mu_1 y(x - y^K)(w_2 - 1) - \mu_2 y^K(1 - y)\{x(1 + \beta)^K - 1\}]}{(1 - zw_1)[x(1 + \beta)^K - 1][\mu_1 y(w_2 - 1) - \mu_2(1 - y)]}. \tag{2-17}$$






















The joint generating function of the numbers of priority and non-priority customers,
irrespective of the phase of the current customer, is

$$F(s; z, 1, x) = \frac{\mu_1(1 - x)(w_2 - 1)g_{00} + x\{1 - (1 + \beta)^K\}}{\mu_1(1 - zw_1)(w_2 - 1)\{1 - x(1 + \beta)^K\}}. \tag{2-18}$$

As a check, we note that the marginal distribution of the number of priority customers is
the classical result

$$F(s; z, 1, 1) = \frac{1}{\mu_1\{1 - zw_1(s, 1)\}\{w_2(s, 1) - 1\}}. \tag{2-19}$$

The generating function of particular interest, that of the non-priority customers, is

$$F(s; 1, 1, x) = \frac{\mu_1(1 - x)(w_2 - 1)g_{00} + x\{1 - (1 + \beta)^K\}}{(s + \lambda_2 - \lambda_2 x)\{1 - x(1 + \beta)^K\}}. \tag{2-20}$$

The solution is now complete except for the unknown $g_{00}(s)$, the Laplace transform of the
null probability. An explicit expression for $g_{00}(s)$ may be found by the following well known
argument. Since F(s; 1, 1, x) is a generating function it must converge for at least |x| < 1, so
that zeros in x of denominator and numerator within the unit circle coincide. By Rouché’s
Theorem the only zero of the denominator within the unit circle is that of

$$1 - x\{1 + \beta(s, x)\}^K = 0.$$

If this zero is $x = \xi(s)$, then

$$g_{00}(s) = \{\mu_1(w_2(s, \xi) - 1)\}^{-1}. \tag{2-21}$$

Using the Lagrange Inversion formula (Whittaker & Watson, 1950, p. 132) $\xi(s)$, and hence
$g_{00}(s)$, can be found. The final result is too complicated for practical purposes and is not given
here.
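
Although the closed form is unwieldy, the zero $\xi(s)$ and hence $g_{00}(s)$ are easily obtained numerically for real s > 0. The sketch below uses bisection on (0, 1) in place of the Lagrange inversion mentioned above; this is a substitution made for illustration only. It relies on $x\{1 + \beta(s, x)\}^K - 1$ being negative at x = 0 and positive at x = 1. All function names are hypothetical.

```python
import math

def w2(s, x, lam1, lam2, mu1):
    """Larger root of the characteristic equation (2-16), real arguments assumed."""
    b = s + lam1 + lam2 + mu1 - lam2 * x
    return (b + math.sqrt(b * b - 4.0 * lam1 * mu1)) / (2.0 * mu1)

def beta(s, x, lam1, lam2, mu1, mu2):
    return (w2(s, x, lam1, lam2, mu1) - 1.0) * mu1 / mu2

def xi(s, lam1, lam2, mu1, mu2, K, iters=200):
    """Bisection for the zero in (0, 1) of x*(1 + beta(s, x))**K - 1, real s > 0."""
    def f(x):
        return x * (1.0 + beta(s, x, lam1, lam2, mu1, mu2)) ** K - 1.0
    lo, hi = 0.0, 1.0            # f(0) = -1 < 0 and f(1) > 0 when s > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def g00(s, lam1, lam2, mu1, mu2, K):
    """Laplace transform of the null probability, from (2-21)."""
    x = xi(s, lam1, lam2, mu1, mu2, K)
    return 1.0 / (mu1 * (w2(s, x, lam1, lam2, mu1) - 1.0))

# For small s, s*g00(s) approaches the equilibrium null probability 1 - rho1 - rho2.
print(1e-6 * g00(1e-6, 0.2, 0.3, 1.0, 2.0, 2))
```

With $\lambda_1 = 0.2$, $\lambda_2 = 0.3$, $\mu_1 = 1$, $\mu_2 = 2$ and K = 2 the printed value should be close to 0.5, in agreement with Abel’s theorem applied to $g_{00}(s)$.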


A case of especial interest is that when the service time is constant, that is K = ∞. Write
$\mu_2 = \nu_2 K$, so that the constant service time is of duration $1/\nu_2$. Then

$$\lim_{K \to \infty}\{1 + \beta(s, x)\}^K = \exp\{\mu_1\nu_2^{-1}[w_2(s, x) - 1]\}. \tag{2-22}$$

With this substitution the preceding formulae hold in this case also.
The equilibrium distributions, when they exist, are easily derived from (2-17). We use
a generalization of Abel’s Theorem (Widder, 1946, Chapter V) which yields

$$\lim_{s \to +0} s\,g_{rmn}(s) = \lim_{t \to \infty} P_{rmn}(t) = p_{rmn}$$

whenever the limit on the right-hand side exists. It is easy to see that the equilibrium
distribution exists when $1 > \rho_1 + \rho_2 = \lambda_1\mu_1^{-1} + \lambda_2 K\mu_2^{-1}$, and when this is so, we have directly
from (2-18), the steady-state generating function

$$\lim_{s \to +0} sF(s; z, 1, x) = \Phi(z, x) = \frac{(1 - x)\,p_{00}}{\{1 - zw_1(0, x)\}\{1 - x[1 + \beta(0, x)]^K\}}. \tag{2-23}$$

Since $\Phi(1, 1) = 1$,

$$p_{00} = 1 - \rho_1 - \rho_2. \tag{2-24}$$


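Moments are easily found by differentiating $\Phi(z, x)$. The expectation of the number of non-
priority customers, denoted by $\bar{n}_K$, is

$$\bar{n}_K = \frac{\rho_2\{2[1 - \rho_1 + \rho_1(\mu_2/\mu_1 K)] - (1 - K^{-1})\rho_2\}}{2(1 - \rho_1)(1 - \rho_1 - \rho_2)}. \tag{2-25}$$

In particular

$$\bar{n}_{\infty} = \frac{\rho_2\{2[1 - \rho_1 + \rho_1\nu_2\mu_1^{-1}] - \rho_2\}}{2(1 - \rho_1)(1 - \rho_1 - \rho_2)}. \tag{2-26}$$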
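
Formulae (2-25) and (2-26) are simple to evaluate. The following sketch, with hypothetical function names, computes both under the parametrisation used here: each of the K phases has rate $\mu_2$, so that $\rho_2 = \lambda_2 K/\mu_2$.

```python
def mean_nonpriority(lam1, lam2, mu1, mu2, K):
    """Expected number of non-priority customers, formula (2-25)."""
    rho1 = lam1 / mu1
    rho2 = lam2 * K / mu2
    if rho1 + rho2 >= 1.0:
        raise ValueError("no equilibrium distribution: rho1 + rho2 >= 1")
    num = rho2 * (2.0 * (1.0 - rho1 + rho1 * mu2 / (mu1 * K)) - (1.0 - 1.0 / K) * rho2)
    return num / (2.0 * (1.0 - rho1) * (1.0 - rho1 - rho2))

def mean_nonpriority_constant(lam1, lam2, mu1, nu2):
    """Constant service time of duration 1/nu2, formula (2-26)."""
    rho1 = lam1 / mu1
    rho2 = lam2 / nu2
    if rho1 + rho2 >= 1.0:
        raise ValueError("no equilibrium distribution: rho1 + rho2 >= 1")
    num = rho2 * (2.0 * (1.0 - rho1 + rho1 * nu2 / mu1) - rho2)
    return num / (2.0 * (1.0 - rho1) * (1.0 - rho1 - rho2))

print(mean_nonpriority(0.2, 0.3, 1.0, 2.0, 2))        # rho1 = 0.2, rho2 = 0.3
print(mean_nonpriority_constant(0.2, 0.3, 1.0, 1.0))  # same loads, constant service
```

Two checks are available: with $\lambda_1 = 0$ and K = 1 the first function reduces to $\rho_2/(1 - \rho_2)$, the simple negative exponential queue, and for large K with $\mu_2 = \nu_2 K$ it approaches the second.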










3. MODEL FOR BREAKDOWNS

The preceding model is now modified to describe in a more realistic manner the way in 
which breakdowns occur. The essential difference between a breakdown and a priority 
arrival is that breakdowns can occur only when a customer is in service, and also a queue of 
breakdowns cannot form. If r denotes the number of breakdowns, then r can take only the
values 0 and 1. The random walk in the (r, n)-plane takes place only on the lines r = 0 and
r = 1, excluding the point (1, 0). If r = 0, the three permissible transitions from the point
(0, n), n = 1, 2, ..., are to the points (0, n − 1), (0, n + 1), or (1, n). If r = 1 only two transitions
are allowable, to (1, n + 1) or (0, n). From the point (0, 0) the only transition possible is to (0, 1).
If $q_{rmn}(s)$ denotes the Laplace transform of the unknown probability $P_{rmn}(t)$ and
$\alpha = s + \lambda_1 + \lambda_2$, then, for an empty system initially, the difference equations satisfied by
$q_{rmn}(s)$ are

$$
\begin{aligned}
-(s + \lambda_2) q_{00} + \mu_2 q_{011} &= -1 &&(r = 0,\ n = 0), &&(3\text{-}1)\\
-(\alpha + \mu_2) q_{0m1} + \mu_1 q_{1m1} + \mu_2 q_{0\,m+1\,1} &= 0 &&(r = 0,\ n = 1,\ 1 \le m < K), &&(3\text{-}2)\\
-(\alpha + \mu_2) q_{0K1} + \lambda_2 q_{00} + \mu_1 q_{1K1} + \mu_2 q_{012} &= 0 &&(r = 0,\ n = 1,\ m = K), &&(3\text{-}3)\\
-(\alpha + \mu_2) q_{0mn} + \lambda_2 q_{0m\,n-1} + \mu_1 q_{1mn} + \mu_2 q_{0\,m+1\,n} &= 0 &&(r = 0,\ n > 1,\ 1 \le m < K), &&(3\text{-}4)\\
-(\alpha + \mu_2) q_{0Kn} + \lambda_2 q_{0K\,n-1} + \mu_1 q_{1Kn} + \mu_2 q_{01\,n+1} &= 0 &&(r = 0,\ n > 1,\ m = K), &&(3\text{-}5)\\
-(s + \lambda_2 + \mu_1) q_{1m1} + \lambda_1 q_{0m1} &= 0 &&(r = 1,\ n = 1,\ 1 \le m \le K), &&(3\text{-}6)\\
-(s + \lambda_2 + \mu_1) q_{1mn} + \lambda_1 q_{0mn} + \lambda_2 q_{1m\,n-1} &= 0 &&(r = 1,\ n > 1,\ 1 \le m \le K). &&(3\text{-}7)
\end{aligned}
$$

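Proceeding as before the joint generating function of breakdowns and customers is

$$G(s; z, x) = q_{00} + \sum_{r=0}^{1}\sum_{m=1}^{K}\sum_{n=1}^{\infty} z^r x^n q_{rmn}(s)$$

$$= q_{00} - \frac{x(s + \mu_1 + \lambda_2 - \lambda_2 x + \lambda_1 z)[1 - (1 + \theta)^K][(s + \lambda_2 - \lambda_2 x)q_{00} - 1]}{\mu_2(s + \mu_1 + \lambda_2 - \lambda_2 x)\,\theta\,[1 - x(1 + \theta)^K]}, \tag{3-8}$$

where

$$\theta = \theta(s, x) = \frac{(s + \mu_1 + \lambda_1 + \lambda_2 - \lambda_2 x)(s + \lambda_2 - \lambda_2 x)}{\mu_2(s + \mu_1 + \lambda_2 - \lambda_2 x)}.$$

The generating function of the number of customers is

$$G(s; 1, x) = \frac{(1 - x)(s + \lambda_2 - \lambda_2 x)q_{00} + x[1 - (1 + \theta)^K]}{(s + \lambda_2 - \lambda_2 x)[1 - x(1 + \theta)^K]}. \tag{3-9}$$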


The null probability $\mathcal{L}^{-1}[q_{00}(s)]$ can be found using the Lagrange Inversion Theorem, but
as in the preemptive priority model the calculations involved are lengthy.
If $\phi(x)$ is the generating function of the equilibrium distribution, then from (3-9),

$$\phi(x) = \frac{(1 - x)f_{00}}{1 - x[1 + \theta(0, x)]^K}, \tag{3-10}$$

where $f_{00}$ denotes the null probability. This is easily found to be

$$f_{00} = 1 - \rho_2(1 + \rho_1), \tag{3-11}$$


which agrees with the result of White & Christie (1958, p. 90) in their study of the breakdown
model for the negative exponential queue. Equation (3-11) implies that the condition for
the existence of the equilibrium distribution is $1 > \rho_2(1 + \rho_1)$, which is less restrictive than
the condition $1 > \rho_1 + \rho_2$ for the preemptive model. The expected number of customers in
the system, denoted by $\bar{b}_K$, is

$$\bar{b}_K = \frac{\rho_2\{2(1 + \rho_1 + \rho_1\rho_2(\mu_2/\mu_1 K)) - (1 - K^{-1})\rho_2(1 + \rho_1)^2\}}{2(1 - \rho_2 - \rho_1\rho_2)}. \tag{3-12}$$

Formulae for the deterministic case, K = ∞, are given, as before, by rewriting $\mu_2$ as $\nu_2 K$
and proceeding to the limit.
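
The breakdown formulae (3-11) and (3-12) may be evaluated in the same way. In the sketch below the function names are hypothetical, $\lambda_1$ and $\mu_1$ are read as the breakdown and repair rates, and $\rho_2 = \lambda_2 K/\mu_2$ as before.

```python
def breakdown_null_probability(lam1, lam2, mu1, mu2, K):
    """Equilibrium probability of an empty system, formula (3-11)."""
    rho1 = lam1 / mu1
    rho2 = lam2 * K / mu2
    return 1.0 - rho2 * (1.0 + rho1)

def breakdown_mean(lam1, lam2, mu1, mu2, K):
    """Expected number of customers in the system, formula (3-12)."""
    rho1 = lam1 / mu1
    rho2 = lam2 * K / mu2
    if rho2 * (1.0 + rho1) >= 1.0:
        raise ValueError("no equilibrium distribution: rho2 * (1 + rho1) >= 1")
    num = rho2 * (2.0 * (1.0 + rho1 + rho1 * rho2 * mu2 / (mu1 * K))
                  - (1.0 - 1.0 / K) * rho2 * (1.0 + rho1) ** 2)
    return num / (2.0 * (1.0 - rho2 - rho1 * rho2))

print(breakdown_null_probability(0.2, 0.3, 1.0, 2.0, 2))
print(breakdown_mean(0.2, 0.3, 1.0, 2.0, 2))
```

Setting $\lambda_1 = 0$ removes the breakdowns, and the mean then reduces to $\rho_2\{2 - (1 - K^{-1})\rho_2\}/\{2(1 - \rho_2)\}$, the value for the uninterrupted Erlang service queue.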

For purposes of comparison we list here some formulae for the Poisson input, Erlang
service time queueing model in which no interruptions occur, either in the form of break-
downs or priority arrivals. If $\lambda_2$, $\mu_2$ are respectively the arrival and service parameters;
$h_0(s)$ the Laplace transform of the null probability; and if initially there are no customers
in the system, then the generating function is

$$H(s, x) = \frac{(1 - x)(s + \lambda_2 - \lambda_2 x)h_0 + x\{1 - \gamma^K\}}{(s + \lambda_2 - \lambda_2 x)\{1 - x\gamma^K\}}, \tag{3-13}$$

where

$$\gamma(s, x) = (s + \mu_2 + \lambda_2 - \lambda_2 x)\mu_2^{-1}.$$

The null probability $\mathcal{L}^{-1}[h_0(s)]$ can be found explicitly in this case in terms of generalized
Bessel functions (Luchak, 1958). The equilibrium generating function is

$$\Psi(x) = \frac{(1 - x)(1 - \rho_2)}{1 - x\{1 + (1 - x)\lambda_2\mu_2^{-1}\}^K} \tag{3-14}$$

and the expected queue length

$$\frac{\rho_2\{2 - (1 - K^{-1})\rho_2\}}{2(1 - \rho_2)}. \tag{3-15}$$

If the rate at which interruptions occur, $\lambda_1$, is zero, then putting $\mu_1 = \mu_2$, it is easy to check
that the formulae derived for the priority and breakdown models reduce to (3-13)-(3-15).


REFERENCES 


Barry, J. Y. (1956). A priority queueing problem. Opns. Res. 4, 385.

Heathcote, C. R. (1959). The time-dependent problem for a queue with preemptive priorities.
Opns. Res. 7, 670-80.

Heathcote, C. R. (1960). A simple queue with several preemptive priority classes. To be published.

Luchak, G. (1958). The continuous time solution of the equations of the single channel queue with a
general class of service-time distributions by the method of generating functions. J. R. Statist.
Soc. B, 20, 176-81.

Morse, P. M. (1958). Queues, Inventories and Maintenance. New York: John Wiley and Sons.

Stephan, F. (1958). Two queues under preemptive priority. Opns. Res. 6, 399-418.

Whittaker, E. T. & Watson, G. N. (1950). Modern Analysis, 4th edition. Cambridge.

White, H. & Christie, L. S. (1958). Queueing with preemptive priorities or with breakdown. Opns.
Res. 6, 79-95.

Widder, D. V. (1946). The Laplace Transform. Princeton University Press.
















Biometrika (1961), 48, 1 and 2, p. 65
Printed in Great Britain 


A two-sample sequential t-test


By J. HAJNAL 
London School of Economics 


1. THE BASIS OF THE TEST 


Observations are taken from two normal populations with (unknown) means $\mu_1$ and
$\mu_2$, and common (unknown) variance $\sigma^2$. It is desired to test the hypothesis ($H_0$) that

$$\mu_1 = \mu_2$$

against the alternative hypothesis ($H_1$) that

$$\frac{(\mu_2 - \mu_1)^2}{\sigma^2} = \zeta^2.$$

The ordinary non-sequential test in this situation is, of course, the two-sample t-test.
Let the observations from the two populations be denoted respectively $x_j$ and $y_j$ (j = 1,
2, ...). Suppose that $N_x$ observations have been taken from one population and $N_y$ from the
other. We compute

$$t = \frac{\bar{x} - \bar{y}}{s\sqrt{u}}, \tag{1}$$

where

$$\bar{x} = \frac{1}{N_x}\sum_j x_j, \qquad \bar{y} = \frac{1}{N_y}\sum_j y_j,$$

$$s^2 = \frac{\sum_j (x_j - \bar{x})^2 + \sum_j (y_j - \bar{y})^2}{N_x + N_y - 2}$$

and

$$u = \frac{1}{N_x} + \frac{1}{N_y}.$$

Under $H_0$, t as computed according to (1) has the t-distribution with $N_x + N_y - 2$ degrees
of freedom. Under $H_1$ on the other hand it has the non-central t-distribution with $N_x + N_y - 2$
degrees of freedom and non-centrality parameter $\lambda = \zeta/\sqrt{u}$. The non-central t-distribution
(see, for example, Johnson & Welch, 1939) is the distribution of

$$\frac{\epsilon + \lambda}{\sqrt{W_f}},$$

where $\lambda$ is a constant, $\epsilon$ is a normal variable with mean 0 and variance 1, $W_f$ is distributed as
$\chi^2/f$ ($\chi^2$ having f degrees of freedom).
In order to obtain a sequential t-test for this situation we compute t at each stage, using
all the observations already accumulated, and calculate the likelihood ratio

$$\frac{p(t^2 \mid H_1)}{p(t^2 \mid H_0)}, \tag{2}$$

where $p(t^2 \mid H_j)$ is the probability density of $t^2$ under the hypothesis $H_j$. Sampling proceeds as
long as the inequality

$$\frac{\beta}{1 - \alpha} < \frac{p(t^2 \mid H_1)}{p(t^2 \mid H_0)} < \frac{1 - \beta}{\alpha} \tag{3}$$

holds. ($\alpha$ and $\beta$ are the risks of falsely accepting $H_0$ and $H_1$, respectively.) When (3) is
violated, no further observations are taken. Whether $H_0$ or $H_1$ is accepted depends on
which side of the inequality is violated.

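The ratio (2) is a ratio of the density function of non-central $t^2$ to that of central $t^2$. It
is known (e.g. Introduction to National Bureau of Standards, 1951) that such a ratio can be
expressed in the form

$$\exp(-\tfrac{1}{2}\lambda^2)\,F\left[\frac{f + 1}{2}, \frac{1}{2}; \frac{\lambda^2 t^2}{2(f + t^2)}\right], \tag{4}$$

where $\lambda$ is the non-centrality parameter, f is the number of degrees of freedom and F(a, b; x)
is the confluent hypergeometric function.
Using expression (4), the inequality (3) becomes

$$\frac{\beta}{1 - \alpha} < \exp\left(-\frac{\zeta^2}{2u}\right) F\left[\frac{N - 1}{2}, \frac{1}{2}; \frac{\zeta^2 t^2}{2u(N - 2 + t^2)}\right] < \frac{1 - \beta}{\alpha}, \tag{5}$$

where $N = N_x + N_y$. It is usually more convenient to take logarithms and write

$$\ln\frac{\beta}{1 - \alpha} < -\frac{\zeta^2}{2u} + \ln F\left[\frac{N - 1}{2}, \frac{1}{2}; \frac{\zeta^2 t^2}{2u(N - 2 + t^2)}\right] < \ln\frac{1 - \beta}{\alpha}. \tag{6}$$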
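
The logarithmic form of the criterion can be evaluated directly when tables are not to hand. The following sketch sums Kummer's series for F(a, b; x), which converges for the positive arguments arising here. The function names are hypothetical, and the decision labels follow the rule that violation of the lower limit leads to acceptance of $H_0$.

```python
import math

def hyp1f1(a, b, x, tol=1e-15, max_terms=10_000):
    """Kummer's series F(a, b; x) = sum_k (a)_k x^k / ((b)_k k!)."""
    term, total = 1.0, 1.0
    for k in range(max_terms):
        term *= (a + k) * x / ((b + k) * (k + 1))
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

def log_likelihood_ratio(t, f, lam):
    """Logarithm of the ratio of non-central to central t^2 densities:
    -lam^2/2 + ln F((f+1)/2, 1/2; lam^2 t^2 / (2 (f + t^2)))."""
    arg = lam * lam * t * t / (2.0 * (f + t * t))
    return -0.5 * lam * lam + math.log(hyp1f1(0.5 * (f + 1), 0.5, arg))

def decision(t, n_x, n_y, zeta, alpha, beta):
    """Apply the sequential limits after n_x + n_y observations."""
    u = 1.0 / n_x + 1.0 / n_y
    llr = log_likelihood_ratio(t, n_x + n_y - 2, zeta / math.sqrt(u))
    if llr <= math.log(beta / (1.0 - alpha)):
        return "accept H0"
    if llr >= math.log((1.0 - beta) / alpha):
        return "accept H1"
    return "continue"

print(decision(0.66, 7, 5, 2.0, 0.05, 0.05))
```

The design choice of summing the series term by term keeps the sketch free of external libraries; for large arguments an asymptotic form or a library routine would be preferable.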


We now need to show that the procedure described constitutes a valid sequential proba- 
bility ratio test as developed by Wald (1947). 

We suppose first that at each stage of sampling a predetermined number of x- and y-
observations are taken. Thus we may take observations singly, alternating between the
two populations, or one x- and one y-observation at each stage, or one x- and three y-observa-
tions, etc. (The effects of taking observations in groups are discussed in Wald’s book.)

Denote the sequence of the first N values observed $x_1, x_2, \ldots, x_{N_x}$; $y_1, y_2, \ldots, y_{N_y}$ by $S_N$
and let $p(S_N \mid H_j)$ be the joint probability density function for $S_N$ under hypothesis $H_j$.
$p(S_N \mid H_j)$ will involve the parameters $\mu_1$, $\mu_2$, and $\sigma$ of which jointly sufficient estimators are
available. These parameters may be expressed in terms of $|\zeta|$, $\sigma$, $\mu_1$, and the sign of $\zeta$. The
distribution of $t^2$ depends only upon $|\zeta|$, but not upon the other parameters.

By a theorem due to Cox (1952)†, we may write

$$p(S_N \mid H_j) = p(t^2 \mid H_j) \times R \qquad (j = 0, 1), \tag{7}$$

where R does not involve $\zeta$. Hence

$$\frac{p(S_N \mid H_1)}{p(S_N \mid H_0)} = \frac{p(t^2 \mid H_1)}{p(t^2 \mid H_0)}. \tag{8}$$


It may then be shown by Cox’s argument that the test proposed has probabilities of error
roughly equal to $\alpha$ and $\beta$, provided the probability is one that the procedure terminates.

† The main difficulty in checking the applicability of Cox’s theorem arises from the last of his
conditions. The set of transformations which carries an observation $x_j$ into $a + bx_j$ and $y_j$ into $a + by_j$
($-\infty < a < \infty$; $-\infty < b < \infty$) satisfies this condition.

The certainty of termination may be demonstrated by comparing our test with a suitably
constructed single-sample two-sided sequential t-test. We assume that our sampling pro-
cedure is such that in each batch of N′ successive observations there are rN′ x-observations
and (1 − r)N′ y-observations and we consider the probability that the inequality (3) is
violated when N = N′, 2N′, 3N′, ... observations have been analysed. (This probability is
smaller than the probability that the test terminates at some point up to and including the
Nth observation.) We suppose that our two sample test is carried out in a situation where

$$\frac{(\mu_2 - \mu_1)^2}{\sigma^2} = (\zeta')^2.$$

Now consider a single-sample two-sided t-test of the hypothesis $H_0$ that the mean $\mu$ of a
normal population of variance $\sigma^2$ is zero against $H_1$ that $\mu^2/\sigma^2 = \zeta^2 r(1 - r)$. The risks of error
have the same values $\alpha$ and $\beta$ as before. Suppose the test is applied to observations drawn
from a normal population with variance $\sigma^2$ and mean $\mu$ such that

$$\mu^2/\sigma^2 = (\zeta')^2 r(1 - r).$$

At each stage t is computed. After N observations this t will have the non-central t-distribu-
tion with N − 1 degrees of freedom and non-centrality parameter $\zeta'\sqrt{\{Nr(1 - r)\}}$. From the
observed t we calculate the likelihood ratio

$$\frac{p(t^2 \mid H_1)}{p(t^2 \mid H_0)}$$

using in the numerator a non-central t-distribution with N − 1 degrees of freedom and non-
centrality parameter $\zeta\sqrt{\{Nr(1 - r)\}}$, while the density function for the denominator is the
ordinary t-distribution with N − 1 degrees of freedom. With this value of the likelihood
ratio we then observe whether the inequality (3) is violated.

Compare these steps with the proposed two-sample t-test bearing in mind that, for
values of N which are multiples of N′,

$$1/u = Nr(1 - r).$$

The t computed with N observations has the same distribution as the t in the single-sample
test except that there is 1 degree of freedom less. The likelihood ratio is also based on the
same distributions except that in the two-sample test there are only N − 2 degrees of freedom.
If the single-sample test ends with probability one, it will then follow that the same is true of
the two-sample test.

The certainty of termination for the two-sided single-sample t-test seems to be generally
accepted. It may be proved by considering two one-sided single sample t-tests. For a
two-sided test of the hypothesis $H_0$ that $\mu = 0$, against $H_1$ that $\mu^2/\sigma^2 = \delta^2$ with $\alpha$ and $\beta$ as
the risks of falsely accepting $H_0$ and $H_1$, respectively, we take the following one-sided tests,
both of which are assumed to be carried out on the same set of observations as the corre-
sponding two-sided test

$H_0$: $\mu = 0$ against $H_1$: $\mu/\sigma = -\delta$,
and $H_0$: $\mu = 0$ against $H_1$: $\mu/\sigma = +\delta$.

For both these one-sided tests the risk of falsely accepting $H_0$ is set at $\alpha$ as before, while
the risk of falsely accepting $H_1$ is taken as $\tfrac{1}{2}\beta$. Then it may be shown that the two-sided test
will terminate no later than the point at which both the one-sided tests have ended. The
fact that both these tests will end with probability one follows from the paper by David
& Kruskal (1956).

To sum up we may conclude that the proposed two-sample test will end with probability 
one and that it is a valid sequential probability ratio test. 


We now abandon the assumption that at each stage a predetermined number of observa-
tions are taken from each population. Suppose, for example, that a medical researcher is
concerned with the difference between men and women patients in a certain characteristic.
He might then wish to take each successive patient, male or female, who becomes available
and decide at each stage whether additional observations are needed. To cover such a situa-
tion, suppose there is some constant probability $\pi$ (known or unknown) that an observa-
tion be taken from the x-population.

The probability frequency function of the observations may be written

$$\pi^{N_x}(1 - \pi)^{N_y}\,p(S_N \mid H_j),$$

where $p(S_N \mid H_j)$ is now the conditional probability of the particular values ($x_1, x_2, \ldots, x_{N_x}$;
$y_1, y_2, \ldots, y_{N_y}$) given the sequence in which observations have been drawn from either the
x- or the y-population. The factor

$$\pi^{N_x}(1 - \pi)^{N_y}$$

is independent of $H_j$ and cancels out in the likelihood ratio. To show that the test is certain to
end, we note that, with probability one, as $N \to \infty$,

$$1/u \to N\pi(1 - \pi).$$

By using this fact in combination with the method of proof sketched earlier for the case
where the sequence of x- and y-observations is predetermined, it may be shown that the test
will end with probability one. (In carrying out this proof we take $\pi(1 - \pi)$ in place of $r(1 - r)$.)

The proposed test may, by an extension of these arguments, be shown to be valid even for
cases where the probability that the next observation be drawn from the x- or y-population
changes as the sampling proceeds. Let $\pi_N$ be the probability that the Nth observation is
from the x-population. Then the proposed test is valid if $\pi_N$ is the same under $H_0$ and $H_1$, is
independent of all earlier observations and there exist numbers $0 < \pi_{\min}$ and $\pi_{\max} < 1$
such that for all N

$$\pi_{\min} < \pi_N < \pi_{\max}.$$

The theory which has been outlined is an extension of that developed by Rushton (1952)
and Colombo (1959). In Rushton’s paper the crucial formulae relating to the two-sample
case contain misprints.† The techniques for practical application and estimation of average
sample size which are described in §§2-3 below are new.


2. PRACTICAL COMPUTATION 


Running totals of $\sum x_j$, $\sum y_j$ and $(\sum x_j^2 + \sum y_j^2)$ can be accumulated as the observations come in.
If we write

$$T^2 = \frac{(\bar{x} - \bar{y})^2}{\sum x_j^2 + \sum y_j^2 - \left(\dfrac{(\sum x_j)^2}{N_x} + \dfrac{(\sum y_j)^2}{N_y}\right)} = \frac{(\bar{x} - \bar{y})^2}{S^2}, \tag{9}$$

it follows that

$$\frac{t^2}{N - 2 + t^2} = \frac{T^2}{u + T^2}.$$

$T^2$ and hence $T^2/(u + T^2)$ required in (6) may be computed instead of expressions involving
t. Once this is done there are various ways of checking whether equation (6) still holds. The
logarithm of the hypergeometric function can be evaluated by Rushton’s approximation
(Rushton, 1952) or by the tables of the confluent hypergeometric function published by the
U.S. National Bureau of Standards (1949).

† The corrigenda may be found on an unnumbered page wrongly placed before the first page
(p. 287) of parts 3-4 of Biometrika, vol. 41 (1954). See vol. 42 (1955), p. 277.

The simplest procedure, however, is to utilize another set of tables prepared by the
National Bureau of Standards (1951). These tables, which will be referred to as the ‘N.B.S.
tables’, are for use with the single sample sequential t-test. For selected L, n and $\delta$ they give
values of Z which are solutions of the equation

$$L = \ln F[\tfrac{1}{2}n, \tfrac{1}{2}; \tfrac{1}{2}\delta^2 Z] - \tfrac{1}{2}n\delta^2. \tag{10}$$

The procedure for the two sample t-test may therefore be put as follows. At each stage
calculate the $T^2$ of (9) and compute

$$Z_N = \frac{(N - 1)T^2}{u + T^2}. \tag{11}$$

Continue sampling so long as

$$\underline{Z}_N < Z_N < \overline{Z}_N. \tag{12}$$

The limits $\underline{Z}_N$ and $\overline{Z}_N$ may be obtained from the N.B.S. tables, entered with

$$n = N - 1, \qquad \delta = \frac{\zeta}{\sqrt{\{u(N - 1)\}}} \tag{13}$$

and

$$L = \ln\frac{\beta}{1 - \alpha} \ (\text{for } \underline{Z}_N), \qquad L = \ln\frac{1 - \beta}{\alpha} \ (\text{for } \overline{Z}_N). \tag{14}$$

(For convenience we shall write $\zeta$ in place of $|\zeta|$ from now on.)
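
The quantities needed to enter the N.B.S. tables follow immediately from the sample sizes. A minimal sketch, with the hypothetical function name `nbs_entry`:

```python
import math

def nbs_entry(zeta, n_x, n_y, alpha, beta):
    """Arguments n, delta and L for entering the N.B.S. tables, per (13) and (14)."""
    N = n_x + n_y
    u = 1.0 / n_x + 1.0 / n_y
    return {
        "n": N - 1,
        "u": u,
        "delta": zeta / math.sqrt(u * (N - 1)),
        "L_lower": math.log(beta / (1.0 - alpha)),
        "L_upper": math.log((1.0 - beta) / alpha),
    }

# zeta = 2 and alpha = beta = 0.05, with 7 x-observations and 5 y-observations
print(nbs_entry(2.0, 7, 5, 0.05, 0.05))
```

With $\alpha = \beta = 0.05$ the two values of L are −ln 19 and ln 19.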


Table 1. Observations and preliminary calculations

                  x-observations                                  y-observations
  N    N_x     x_j     Σx_j      x̄       (Σx_j)²/N_x     N_y     y_j     Σy_j      ȳ        (Σy_j)²/N_y
  1     1     - 4.0   - 4.0      –           –            –       –        –        –            –
  2     –       –       –        –           –            1     + 6.4   + 6.4       –            –
  3     –       –       –        –           –            2     +23.0   +29.4       –            –
  4     2     - 3.8   - 7.8      –           –            –       –        –        –            –
  5     –       –       –        –           –            3     - 4.9   +24.5    + 8.17        200.08
  6     3     + 4.4   - 3.4    -1.13        3.85          –       –        –        –            –
  7     4     + 5.3   + 1.9    +0.475       0.90          –       –        –        –            –
  8     –       –       –        –           –            4     +23.8   +48.3    +12.075       583.22
  9     5     +10.0   +11.9    +2.38       28.32          –       –        –        –            –
 10     –       –       –        –           –            5     +16.3   +64.6    +12.92        834.63
 11     6     +15.7   +27.6    +4.60      126.96          –       –        –        –            –
 12     7     +30.1   +57.7    +8.24      475.61          –       –        –        –            –


A way of setting out the calculations is illustrated in Tables 1-3. For the example shown 
observations were drawn at random from a normal population with mean and variance 
both equal to 10. The observations were randomly allocated to the two groups with equal 
probability. This corresponds to a situation where observations come in with roughly equal 
frequency, but not in strict alternation, from two populations whose means are in fact equal, 
i.e. $\pi = 0.5$ and $H_0$ is true. The alternative hypothesis tested is that $\zeta = 2$, with $\alpha = \beta = 0.05$.

As observations become available they are entered on the appropriate side of Table 1, 
where some additional calculations are also performed. The computation of stopping limits 








is shown in Table 2. Finally, in Table 3, figures from the two sides of Table 1 are combined 
to calculate the sequential criterion Z,y which can be compared with the stopping limits 
obtained in Table 2. In each table a new line is added for each observation. The procedure is 
governed by the N.B.S. tables and can be understood only by reference to them in conjunc- 


Table 2. The calculation of stopping limits

                               Lower limit                    Upper limit
                        Nearest N.B.S.     Inter-      Nearest N.B.S.     Inter-
                            limits         polated         limits         polated
 (1)    (2)     (3)      (4)      (5)      (6)        (7)      (8)      (9)
  N      u       δ     δ = 1.0  δ = 1.2   limit     δ = 1.0  δ = 1.2   limit

  2    2.0     1.41      -        -        -          -        -        -
  3    1.5     1.15      -        -        -          -        -        -
  4    1.0     1.15      -        -        -          -        -        -
  5    0.833   1.10      -        -        -          -        -        -
  6    0.666   1.10      -      (0.22)     -         4.98     4.48     4.73
  7    0.583   1.07     0.02     0.45     0.17       4.99     4.63     4.86
  8    0.50    1.07     0.19     0.68     0.36       5.05     4.81     4.97
  9    0.45    1.05     0.35     0.92     0.49       5.14     5.01     5.11
 10    0.40    1.05     0.52     1.17     0.68       5.26     5.22     5.25
 11    0.367   1.04     0.69     1.42     0.84       5.39     5.45     5.40
 12    0.343   1.03     0.87     1.67     0.99       5.53     5.68     5.55

Here u = 1/N_x + 1/N_y and δ = ζ/√{u(N−1)}.

Notes:

Column (3). Calculation of δ by formula (13) for entry in N.B.S. tables.

Columns (4)-(5). Limits from N.B.S. tables under L = −ln 19 since α = β = 0.05.

Column (6). Linear interpolation between columns (4) and (5) for δ computed under (3), e.g.
0.02 + (0.07/0.20)(0.45 − 0.02) = 0.17.

Columns (7)-(8). Limits from N.B.S. tables under L = ln 19.

Column (9). Linear interpolation between columns (7) and (8).


Table 3. Computation of sequential criterion

 (1)      (2)         (3)        (4)       (5)       (6)      (7)       (8)           (9)
  N    Σ(x²+y²)       S²        ȳ − x̄      T²      u + T²    Z_N    Lower limit   Upper limit

  1      16.00         -          -         -         -        -         -             -
  2      56.96         -          -         -         -        -         -             -
  3     585.96         -          -         -         -        -         -             -
  4     600.40         -          -         -         -        -         -             -
  5     624.41         -          -         -         -        -         -             -
  6     643.77      439.83     + 9.30     0.197     0.863     1.14       -            4.73
  7     671.86      470.87     + 7.69     0.126     0.709     1.07      0.17          4.86
  8   1 238.30      656.18     +11.60     0.205     0.705     2.04      0.36          4.97
  9   1 338.30      726.76     + 9.70     0.129     0.579     1.78      0.49          5.11
 10   1 603.99      741.04     +10.54     0.150     0.550     2.45      0.68          5.25
 11   1 850.48      888.89     + 8.32     0.0779    0.445     1.75      0.84          5.40
 12   2 756.49    1 446.25     + 4.68     0.0151    0.359     0.46      0.99          5.55

Notes:

Column (3). $S^2 = \Sigma(x^2 + y^2) - \dfrac{(\Sigma x_i)^2}{N_x} - \dfrac{(\Sigma y_i)^2}{N_y}$.

Column (5). $T^2 = \dfrac{(\bar{y} - \bar{x})^2}{S^2}$.

Column (7). $Z_N = \dfrac{(N-1)\,T^2}{u + T^2}$.







tion with equations (11) to (14) above. In the present example at N = 12 we find that 
$Z_N < \underline{Z}_N$. Hence sampling stops with the (correct) decision to accept $H_0$.
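
The arithmetic behind one line of Table 3 can be sketched as follows. This is an illustration only: the function names are hypothetical, and the formulas $S^2 = \Sigma(x^2+y^2) - (\Sigma x)^2/N_x - (\Sigma y)^2/N_y$, $T^2 = (\bar{y}-\bar{x})^2/S^2$ and $Z_N = (N-1)T^2/(u+T^2)$ are taken from the notes to Table 3, with $u = 1/N_x + 1/N_y$ assumed as in Table 2.

```python
def sequential_criterion(xs, ys):
    """Return (u, S2, T2, Z_N) for the observations collected so far.

    xs, ys are the observations from the two groups (hypothetical inputs).
    """
    n_x, n_y = len(xs), len(ys)
    n = n_x + n_y
    if n_x == 0 or n_y == 0:
        raise ValueError("need at least one observation in each group")
    sum_x, sum_y = sum(xs), sum(ys)
    # Pooled within-group sum of squares (Table 3, column 3).
    s2 = (sum(x * x for x in xs) + sum(y * y for y in ys)
          - sum_x ** 2 / n_x - sum_y ** 2 / n_y)
    if s2 <= 0:
        raise ValueError("within-group sum of squares must be positive")
    u = 1.0 / n_x + 1.0 / n_y
    diff = sum_y / n_y - sum_x / n_x          # y-bar minus x-bar (column 4)
    t2 = diff ** 2 / s2                        # column 5
    z_n = (n - 1) * t2 / (u + t2)              # column 7
    return u, s2, t2, z_n


def criterion_from_summary(n, u, s2, diff):
    """Same criterion computed from the summary figures tabulated in Table 3."""
    t2 = diff ** 2 / s2
    return (n - 1) * t2 / (u + t2)


# Line N = 12 of Table 3: N_x = 7, N_y = 5, S^2 = 1446.25, y-bar - x-bar = 4.68.
z_12 = criterion_from_summary(12, 1 / 7 + 1 / 5, 1446.25, 4.68)
```

With the figures of line $N = 12$ the sketch gives a value close to the tabulated 0.46, which lies below the interpolated lower limit 0.99.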

Column (2) shows the computation of the value of $\delta$ to be used for entry into the N.B.S.
tables in accordance with formula (13). Normally the $\delta$ so calculated will vary from line to
line and will not correspond to any of the $\delta$’s provided for in the N.B.S. tables. To find fairly
precise stopping limits interpolation is, therefore, necessary. Linear interpolation is almost
invariably sufficient. (The accuracy of interpolation with respect to $\delta$ is discussed on pages xii
and xvii of the introduction to the N.B.S. tables.) Some of the interpolated limits in Table 2
may be in error in the second place of decimals, but this is of no practical significance.
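
As a rough illustration of the interpolation step, the sketch below recomputes line $N = 7$ of Table 2. The helper names are hypothetical, and the expression $\delta = \zeta/\sqrt{\{u(N-1)\}}$ with $u = 1/N_x + 1/N_y$ is an assumption inferred from the tabulated values of columns (2) and (3); the N.B.S. limits themselves are simply copied from the table.

```python
import math


def entry_delta(zeta, n_x, n_y):
    """delta used to enter the N.B.S. tables (assumed form of formula (13))."""
    n = n_x + n_y
    u = 1.0 / n_x + 1.0 / n_y
    return zeta / math.sqrt(u * (n - 1))


def interpolate_limit(delta, d_lo, limit_lo, d_hi, limit_hi):
    """Linear interpolation between the limits tabulated at d_lo and d_hi."""
    frac = (delta - d_lo) / (d_hi - d_lo)
    return limit_lo + frac * (limit_hi - limit_lo)


# Line N = 7 of Table 2: four observations in one group, three in the other.
delta_7 = entry_delta(2.0, 4, 3)
d = round(delta_7, 2)                                   # 1.07, as in column (3)
lower_7 = interpolate_limit(d, 1.0, 0.02, 1.2, 0.45)    # column (6)
upper_7 = interpolate_limit(d, 1.0, 4.99, 1.2, 4.63)    # column (9)
```

Rounded to two decimals, the two interpolated limits agree with the tabulated 0.17 and 4.86.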

$\underline{Z}_N$ and $\bar{Z}_N$ are only given in the N.B.S. tables from certain values of $N$ onwards; i.e.
whatever the observations turn out to be, it is necessary for $N$ to exceed a certain minimum
before it is possible to obtain a likelihood ratio which violates the inequality (3) and permits
sampling to stop. Mathematically the reason is that for low $N$ one side of the inequality can
be violated only if $Z_N$ is negative and the other if $Z_N$ exceeds $N - 1$; neither of these
alternatives is possible. Sometimes, as in the example shown, a difficulty arises in determining
one of the initial stopping points in Table 2. The N.B.S. tables give a value of $\underline{Z}_N$ for $N = 6$,
$\delta = 1.2$, but under $\delta = 1.0$ the first $\underline{Z}_N$ corresponds to $N = 7$. Is there a $\underline{Z}_N$ for $N = 6$ and
$\delta = 1.1$? It is not difficult to compute an appropriate $\underline{Z}_N$, but the simplest solution is to
adopt the rule that acceptance of $H_0$ first becomes possible at the $N$ for which $\underline{Z}_N$ is given in
the N.B.S. tables for both neighbouring values of $\delta$. A corresponding rule is, of course,
appropriate for determining the smallest sample size for the acceptance of $H_1$. The
consequences of these rules will be a negligible increase in the average number of observations
needed; the risk of either kind of error will not exceed the levels specified.

The interpolation of the precise stopping limits between those given in the N.B.S. tables
is usually only necessary for one or two observations at the end. In the example shown, no
interpolation at all would have been required; a comparison of $Z_N$ with the limits provided
by the N.B.S. tables would have been adequate to establish at every stage whether sampling
should continue. Much of Tables 1 and 3 need be computed only when $N$ is large enough so
that it is possible to stop sampling.

If the sequence in which observations will be drawn from the two groups is fixed in
advance, Table 2 can be computed before the experiment and the same interpolated stopping
limits can then be used on several occasions. A simpler procedure which may often be
practical is to use the N.B.S. limits as they stand for a value of $\delta$ which provides a test having
approximately the desired characteristics. For example, in a trial where observations are
drawn in strict alternation from each of the two populations and it is desired to have
roughly $\zeta = 2$ as in the example illustrated, one might well use the N.B.S. limits for $\delta = 1$.
As may be inferred from column (3) of Table 2 this would provide a slightly more stringent
test than intended, i.e. it would have the characteristics of a test in which $\zeta$ or the risks of
error were slightly smaller than those selected. However, since both $\zeta$ and the risks of error
have often to be selected in somewhat arbitrary fashion, this may be no inconvenience.

In applications where observations from the two samples come in with roughly equal
frequency, it is possible to take the observations in pairs and then use $x_1 - y_1$, $x_2 - y_2$, etc.,
as successive values for a single sample sequential $t$-test (putting $\delta = \zeta/\sqrt{2}$). More generally,
if the observations are taken in batches of $N'$ at a time of which $\pi N'$ are from the $x$-population
and $(1-\pi)N'$ from the $y$-population, we may compute $(1-\pi)\Sigma x_i - \pi\Sigma y_i$ at each stage.
($\Sigma x_i$ and $\Sigma y_i$ here stand for the sums of all the $x$ and of all the $y$ observations in one batch.)
A single sample $t$-test with these values using $\delta = \zeta\sqrt{\{N'\pi(1-\pi)\}}$ is then equivalent to the
two-sample procedure.

The single sample test is simpler to carry out and the advantages of the full two-sample
procedure have to be weighed against the additional labour. The possibility of earlier
stopping in the full procedure arises mainly because the degrees of freedom for estimating $\sigma^2$
accumulate faster, but also because the process is examined more often and because, where
the selection of observations is not under the experimenter’s control, there will usually not
be exactly $\pi N'$ and $(1-\pi)N'$ observations of the two kinds in successive batches of $N'$
observations. Where the collection of observations costs much time and effort, as for example
in clinical medicine, the additional work of the full two-sample analysis will often be justified,
especially where it is important to be able to stop early if the difference between the means
compared appears large. When two treatments are compared in a clinical trial this is very
important, since to continue with a trial when large differences have been revealed would
imply exposing some patients to the risk of inferior treatment.

The advantage of the full two-sample procedure in this respect can be substantial. For
example, for $\zeta = 2$, $\pi = 0.2$ the full two-sample procedure may, if the observations so fall out,
end with a decision that there is a difference between the means when $N = 6$. If 5 observations
are taken as a batch each time, such a decision cannot take place before 4 batches,
i.e. 20 observations, have been examined.

Situations where it is not possible to take observations in equal numbers from the two 
populations being compared, or where the experimenter cannot control the selection of 
observations from one population or the other arise in various ways in clinical medicine. 
For example, in a trial of a new drug of which only small quantities are available, it may be 
desired to compare a small number of patients treated with the drug with a larger number 
of controls; one of the two groups being compared in a study may be patients suffering from
a rare disease, etc. The present method was in fact devised in response to the needs of a 
clinical trial. 

3. AVERAGE SAMPLE SIZE 


There is no known method of computing the average number of observations needed for 
sequential tests of composite hypotheses, but it seems worth while to apply to our tests 
a conjecture due to Bhate (1955). 

We consider only the case where $H_0$ is true. Let $\Lambda_N$ be the logarithm of the likelihood ratio
computed on $N$ observations, i.e.

$$ \Lambda_N = \ln \frac{P(S_N \mid H_1)}{P(S_N \mid H_0)}. $$

We suppose that repeated samples of fixed size ($N$ observations) are taken and compute
$E(\Lambda_N)$. $E(\Lambda_N)$ is, of course, a function of $N$. Under $H_0$, $E(\Lambda_N)$ will decrease with increasing $N$.
Bhate’s conjecture says that the value of $N$, say $\bar{N}$, for which

$$ E(\Lambda_N) = (1-\alpha)\ln\left(\frac{\beta}{1-\alpha}\right) + \alpha\ln\left(\frac{1-\beta}{\alpha}\right) \qquad (16) $$

is an approximation to the average sample number of our test.

Bhate’s conjecture is a natural extension to composite hypotheses of Wald’s well-known 
formula giving the average sample number for a sequential probability ratio test of a simple 
hypothesis. Ray (1956) has collected a number of instances where Bhate’s conjecture is in 
good agreement with the results of sampling experiments. 







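To apply Bhate’s conjecture we have to take the expected value of

$$ \Lambda_N = -\frac{\zeta^2}{2u} + \ln F\left[\tfrac{1}{2}(N-1), \tfrac{1}{2}; \frac{\zeta^2}{2u}\,\frac{t^2}{N-2+t^2}\right] \qquad (17) $$

for fixed $N$.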

We consider the case† where the order in which observations are taken from each of the
populations is not fixed, but there is a probability $\pi$ that an observation comes from the
$x$ population. In (17) not only $t$, but also $u = 1/N_x + 1/N_y$ is thus a random variable, though
$N_x + N_y = N$ is assumed fixed. We have

$$ E\left(\frac{1}{u}\right) = \pi(1-\pi)(N-1). \qquad (18) $$


We follow Ray (1956) in approximating to the expected value of $\ln F[\tfrac{1}{2}(N-1), \tfrac{1}{2}; X]$ by
$\ln F[\tfrac{1}{2}(N-1), \tfrac{1}{2}; E(X)]$. If $t^2$ has the $F$-distribution with 1 and $f$ degrees of freedom, $t^2/(f + t^2)$
has a $\beta$-distribution, and

$$ E\left(\frac{t^2}{N-2+t^2}\right) = \frac{1}{N-1}. \qquad (19) $$


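By virtue of the independence of $u$ and $t$ our problem reduces to solving for $N$ the
equation

$$ -\tfrac{1}{2}[\zeta^2\pi(1-\pi)(N-1)] + \ln F[\tfrac{1}{2}(N-1), \tfrac{1}{2}; \tfrac{1}{2}\zeta^2\pi(1-\pi)] = (1-\alpha)\ln\left(\frac{\beta}{1-\alpha}\right) + \alpha\ln\left(\frac{1-\beta}{\alpha}\right). \qquad (20) $$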


Solutions of this equation for $\alpha = \beta = 0.05$ and varying values of $\zeta$ and $\pi$ are given in
Table 4. The equation is easily solved by means of the tables of the confluent hypergeometric
function published by the National Bureau of Standards (1949) provided that an
approximate value of $N$ is available for a starting point. The starting point was obtained by
noting that in these tables the function tabulated, namely

$$ \frac{1}{\sqrt{(2nx)}}\,\ln F(\tfrac{1}{2}n, \tfrac{1}{2}; x), $$

is never far from 0.8 over the range likely to be relevant. Putting this value into equation
(20), we obtain a quadratic equation for $\sqrt{(N-1)}$ and hence, for $\alpha = \beta = 0.05$, approximately

$$ \bar{N} - 1 = \frac{10.5}{\zeta^2\pi(1-\pi)}. \qquad (21) $$

Table 4 was obtained by solving equation (20). A comparison of this table with (21)
shows that this approximation is sufficiently good in all cases where $\bar{N}$ is not below about
10 to make it pointless to solve equation (20) precisely. When the mean sample size is small,
Bhate points out that his conjecture, like Wald’s original formula from which it is derived,
will tend to be an underestimate.
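
The step from equation (20) to the approximation (21) can be retraced numerically. The sketch below is illustrative only: all function names are hypothetical, the confluent hypergeometric function is evaluated by summing its power series directly rather than read from the N.B.S. tables, and the constant 0.8 is the rough value quoted in the text.

```python
import math


def wald_target(alpha, beta):
    """Right-hand side of equation (20)."""
    return ((1 - alpha) * math.log(beta / (1 - alpha))
            + alpha * math.log((1 - beta) / alpha))


def numerator_constant(alpha, beta, k=0.8):
    """Solve -y^2/2 + k*y = target for y = sqrt(c*(N-1)) > 0; return y^2."""
    target = wald_target(alpha, beta)
    y = k + math.sqrt(k * k - 2.0 * target)
    return y * y


def approx_average_n(zeta, pi, alpha=0.05, beta=0.05):
    """Approximation (21), with the numerator recomputed instead of 10.5."""
    return 1.0 + numerator_constant(alpha, beta) / (zeta ** 2 * pi * (1 - pi))


def log_confluent(a, b, x, tol=1e-15, max_terms=100000):
    """ln F(a, b; x) from the series sum_k (a)_k x^k / ((b)_k k!), x >= 0."""
    term = total = 1.0
    for k in range(max_terms):
        term *= (a + k) * x / ((b + k) * (k + 1))
        total += term
        if term < tol * total:
            break
    return math.log(total)


def exact_average_n(zeta, pi, alpha=0.05, beta=0.05):
    """Solve equation (20) for N by bisection on m = N - 1."""
    c = zeta ** 2 * pi * (1 - pi)
    target = wald_target(alpha, beta)

    def f(m):
        return -0.5 * c * m + log_confluent(0.5 * m, 0.5, 0.5 * c) - target

    lo, hi = 1.0, 2.0            # f(1) = -target > 0 because F(a, a; x) = e^x
    while f(hi) > 0:
        hi *= 2.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 1.0 + 0.5 * (lo + hi)
```

For $\alpha = \beta = 0.05$ the recomputed numerator comes out close to 10.5, and for $\zeta = 2$, $\pi = 0.5$ the bisection gives a value that rounds to the entry 12 of Table 4.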

The formula just stated should be compared with that which gives the number of observations
required for the equivalent test with a sample of fixed size, on the assumption that the
variance is known. Suppose we wish to test the difference between the means of two
populations having a common variance $\sigma^2$. We use a two-sided significance test at the 5% level.


† The case where the number of $x$ and $y$ observations for any $N$ total observations is known in
advance is, of course, simpler. It turns out that in the approximation (21) $N$ takes the place of
$(N-1)$.










We take a sample of, say, $N^*$ observations, $N^*\pi$ being from one population and $N^*(1-\pi)$
from the other, where $\pi$ is a given fraction. How large must $N^*$ be if the test is to have a
95% chance of giving a ‘significant’ result, when the true difference between the means is
$\zeta\sigma$? The answer is provided by the formula

$$ N^* = \frac{13}{\zeta^2\pi(1-\pi)}. \qquad (22) $$

The comparison between the two formulae (i.e. the fraction 10.5/13) is not a full measure
of the saving in the number of observations provided by the sequential analysis, for two
reasons. First, formula (22) assumes that the variance is known. For a $t$-test of equivalent
strength where $\pi\zeta^2(1-\pi)$ is large, a somewhat greater $N^*$ would be required (e.g. for
$\pi\zeta^2(1-\pi) = 1$ formula (22) gives $N^* = 13$; using the non-central $t$-distribution we obtain
$N^* = 15$). Secondly, in a situation where observations cannot be freely taken from either
population, it will in most cases take more than $N^*$ observations before $\pi N^*$ $x$-observations
and $(1-\pi)N^*$ $y$-observations have been assembled.
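
The numerator 13 of formula (22) can be checked with the usual normal-theory sample-size argument. The sketch below assumes that the numerator is $(z_{1-\alpha/2} + z_{1-\beta})^2$, neglecting the small chance of significance in the wrong direction; the function names are hypothetical.

```python
from statistics import NormalDist


def fixed_sample_numerator(alpha=0.05, beta=0.05):
    """(z_{1-alpha/2} + z_{1-beta})^2 for a two-sided test of size alpha
    with power 1 - beta, variance assumed known."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(1 - beta)) ** 2


def fixed_sample_size(zeta, pi, alpha=0.05, beta=0.05):
    """N* of formula (22): total observations, a fraction pi from one population."""
    return fixed_sample_numerator(alpha, beta) / (zeta ** 2 * pi * (1 - pi))


num_05 = fixed_sample_numerator(0.05, 0.05)
num_01 = fixed_sample_numerator(0.01, 0.01)
```

Under this assumption the numerator is very nearly 13 for $\alpha = \beta = 0.05$ and very nearly 24 for $\alpha = \beta = 0.01$.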


Table 4. Average sample size under $H_0$
(Bhate’s conjecture; $\alpha = \beta = 0.05$.)

               π
   ζ     0.1    0.2    0.5

   1     117     66     43
   2      30     18     12
   3      14      9      6
   4       9      6      4


For values of $\alpha$ and $\beta$ other than 0.05 the constants in the numerators of (21) and (22)
would, of course, be different from those given. For $\alpha = \beta = 0.01$, the numerator of formula
(21) would be 16 and that of formula (22) would be 24.


Finally, it seems appropriate to record the application of Bhate’s conjecture to the single
sample sequential $t$-test. The average sample size, $\bar{N}$, if the true mean is zero, is the solution
of the equation

$$ -\tfrac{1}{2}\bar{N}\delta^2 + \ln F[\tfrac{1}{2}\bar{N}, \tfrac{1}{2}; \tfrac{1}{2}\delta^2] = (1-\alpha)\ln\left(\frac{\beta}{1-\alpha}\right) + \alpha\ln\left(\frac{1-\beta}{\alpha}\right). \qquad (23) $$

By comparison with formula (20) it is clear that a good approximation to the solution,
for $\alpha = \beta = 0.05$, is provided by $10.5/\delta^2$.


I am grateful to Dr N. L. Johnson for drawing my attention to Bhate’s conjecture and 
making very helpful suggestions for the revision of the manuscript. I should also like to 
thank the referee for pointing out an important gap in the argument. 


REFERENCES 


Bhate, D. H. (1955). Sequential analysis with special reference to distributions of sample size.
London University Ph.D. thesis (unpublished).

Colombo, B. (1959). Appunti di metodologia sequenziale. Memorie della Accademia Patavina di SS.
LL. AA. Classe di Scienze Matematiche e Naturali, 71, 3-30.

Cox, D. R. (1952). Sequential tests for composite hypotheses. Proc. Camb. Phil. Soc. 48, 290-9.

David, H. T. & Kruskal, W. H. (1956). The WAGR sequential t-test reaches a decision with
probability one. Ann. Math. Statist. 27, 797-805.

Johnson, N. L. & Welch, B. L. (1939). Applications of the non-central t-distribution. Biometrika,
31, 362-89.

National Bureau of Standards (1949). Tables of the Confluent Hypergeometric Function F(½n, ½; x)
and Related Functions. Applied Mathematics Series, no. 3. Washington: Government Printing
Office.

National Bureau of Standards (1951). Tables to Facilitate Sequential t-tests. Applied Mathematics
Series, no. 7. Washington: Government Printing Office.

Ray, W. D. (1956). Sequential analysis applied to certain experimental designs in the analysis of
variance. Biometrika, 43, 388-403.

Rushton, S. (1952). On a two-sided sequential t-test. Biometrika, 39, 302-8.

Wald, A. (1947). Sequential Analysis. New York: John Wiley and Sons, Inc.







Biometrika (1961), 48, 1 and 2, p. 77
Printed in Great Britain 


Absolute and incomplete moments of the multivariate 
normal distribution 


By 8. NABEYA 
Hitotsubashi University, Tokyo 


1. Introduction. In general it is difficult to evaluate the exact sampling distribution of 
a statistic which is a function of absolute values of some correlated variates, even when the 
parent distribution is normal. In such cases it is usual to evaluate the moments of the 
statistic, and to approximate the sampling distribution by, e.g. a Pearson type distribution, 
a Gram-Charlier series, or the distribution of some power of a variable which has a gamma 
distribution, etc. In these cases it is often necessary to evaluate the absolute moments of 
the multivariate normal distribution. Geary (1936) was the first to have worked according 
to this idea. 

The present author evaluated some of the absolute moments of the normal distribution 
with means zero, and in (1951) gave the results for the bivariate case up to 12th order. In 
(1952) he gave an integral formula for evaluating absolute moments from the character- 
istic function in the general multivariate distribution, and using this formula he derived 
the absolute moments of the trivariate normal distribution with means zero, also up to 
12th order. 

Working independently, Kamat (1953a) derived the same absolute moments for the 
bivariate and trivariate cases, and further obtained the results for the four-variate case by 
expansion in series. His method was to evaluate contributions to the absolute moment from 
each orthant, named incomplete moments, and sum up all these contributions. Incomplete 
moments were used also for other purposes. He argued that my method was not applicable 
to the evaluation of incomplete moments, and that his method was simpler than mine in 
evaluating certain absolute moments. Using the formulae for absolute moments, he gave the 
third moment of the mean deviation (1954), and of Gini’s mean difference (1953c), in samples
from a normal population, and further he contributed to the sampling distribution of some 
statistics involving absolute values. 

To approximate the sampling distribution, it is often necessary to evaluate the moments
up to the fourth order. Thus for a statistic which is a linear form of absolute values of
normally correlated variables as in Geary’s (1936) case or in Kamat’s (1953b) case etc.,
Nabeya’s (1951, 1952) and Kamat’s (1953a) results given in exact form are not sufficient.
It is desirable to evaluate $E[|x_1x_2x_3x_4|]$ in exact form, where $x_1, x_2, x_3, x_4$ are distributed
according to the four-variate normal distribution with means zero. In Geary’s (1936) and
Kamat’s (1953a) papers, the absolute moment of this form was found as a series expansion.
Kamat (1953b) calculated $E[|x_1x_2x_3x_4|]$ in a special case of the correlation matrix, on the
one hand in exact form using my integral formula, on the other hand in approximate form
using his series expansion. He pointed out that the evaluation by my method was very
elaborate even in his special case.

In §2 of this paper I shall give an extension of the integral formula given in Nabeya
(1952) for the evaluation of absolute moments, and I shall note that the extended formula
may be used for evaluating incomplete moments. In §3 I shall prove, using the formula
given in §2, the single integral formula for $E[\operatorname{sgn}(x_1x_2x_3x_4)]$ originally due to Schläfli
(1858, 1860), in the case of the general correlation matrix, and then shall give $E[|x_1x_2x_3x_4|]$
an exact expression involving $E[\operatorname{sgn}(x_1x_2x_3x_4)]$. Section 4 will be devoted to the absolute
moments $E[|x_1^{n_1}x_2^{n_2}x_3^{n_3}x_4^{n_4}|]$ for the cases $2 \ge n_1 \ge n_2 \ge n_3 \ge n_4 \ge 1$. Finally in §5, using
the results given in §3, I shall give the fourth moment of the mean deviation and of Gini’s
mean difference for a sample taken from a normal population.


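2. $E[x_1^{n_1} \ldots x_r^{n_r}\operatorname{sgn}(x_{j_1} \ldots x_{j_p})]$ and the incomplete moments. Let $x_1, \ldots, x_r$ be distributed
according to an $r$-variate distribution, such that the absolute moments $E[|x_1^{m_1} \ldots x_r^{m_r}|]$ are
finite for any set of integers $m_1, \ldots, m_r$ for which $0 \le m_1 \le n_1, \ldots, 0 \le m_r \le n_r$, for a fixed
set of non-negative integers $n_1, \ldots, n_r$. Put $n = \sum_{j=1}^{r} n_j$. Let $\{j_1, \ldots, j_p\}$ be an arbitrary subset
of integers of the set $\{1, 2, \ldots, r\}$, and $\{k_1, \ldots, k_q\}$ be its complement. Clearly $p + q = r$.
Then by a similar argument as given in Nabeya (1952), we can prove

$$ E[x_1^{n_1} \ldots x_r^{n_r}\operatorname{sgn}(x_{j_1} \ldots x_{j_p})] = E[x_1^{n_1} \ldots x_r^{n_r}\operatorname{sgn}(x_{j_1}) \ldots \operatorname{sgn}(x_{j_p})] $$
$$ = \frac{1}{\pi^p i^{n+p}} \int_{-\infty}^{\infty} \!\!\cdots\! \int_{-\infty}^{\infty} \left[\frac{\partial^n \phi(t_1, \ldots, t_r)}{\partial t_1^{n_1} \ldots \partial t_r^{n_r}}\right]_{t_{k_1} = \ldots = t_{k_q} = 0} \frac{dt_{j_1} \ldots dt_{j_p}}{t_{j_1} \ldots t_{j_p}}, \qquad (1) $$

where $\phi(t_1, \ldots, t_r)$ denotes the characteristic function of $x_1, \ldots, x_r$, and the integral in (1)
with respect to $t_{j_1}, \ldots, t_{j_p}$ is to mean Cauchy’s principal value,

$$ \lim_{\epsilon_1 \to 0, \ldots, \epsilon_p \to 0} \left(\int_{-\infty}^{-\epsilon_1} + \int_{\epsilon_1}^{\infty}\right) dt_{j_1} \ldots \left(\int_{-\infty}^{-\epsilon_p} + \int_{\epsilon_p}^{\infty}\right) dt_{j_p}. $$

If $p = 0$, (1) reduces to the well-known formula for obtaining the ordinary moments
$E[x_1^{n_1} \ldots x_r^{n_r}]$, and if $n_{j_1}, \ldots, n_{j_p}$ are odd and $n_{k_1}, \ldots, n_{k_q}$ are even, (1) is nothing else than the
formula for absolute moments, $E[|x_1^{n_1} \ldots x_r^{n_r}|]$, given in Nabeya (1952).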


Now, let us consider the incomplete moment,

$$ \int_0^{\infty} \!\!\cdots\! \int_0^{\infty} x_1^{n_1} \ldots x_r^{n_r} f(x_1, \ldots, x_r)\, dx_1 \ldots dx_r, \qquad (2) $$

where $f(x_1, \ldots, x_r)$ is the density function of $x_1, \ldots, x_r$. For this purpose we form the expression
$\{1 + \operatorname{sgn}(x_1)\} \ldots \{1 + \operatorname{sgn}(x_r)\}$. It is equal to $2^r$ when $x_1 > 0, \ldots, x_r > 0$, but it is equal to zero
elsewhere except for a set of Lebesgue measure zero. Therefore, (2) is equal to

$$ \frac{1}{2^r} E[x_1^{n_1} \ldots x_r^{n_r}\{1 + \operatorname{sgn}(x_1)\} \ldots \{1 + \operatorname{sgn}(x_r)\}] = \frac{1}{2^r} \sum E[x_1^{n_1} \ldots x_r^{n_r}\operatorname{sgn}(x_{j_1} \ldots x_{j_p})], \qquad (3) $$

where the summation runs over $2^r$ terms, $\{j_1, \ldots, j_p\}$ being any subset of $\{1, 2, \ldots, r\}$.

Here we add two remarks. First, if $f$ is an even function, i.e. $f(x_1, \ldots, x_r) = f(-x_1, \ldots, -x_r)$,
as in the cases of the multivariate normal distribution with zero means, then the expectation
(1) is equal to zero when $n + p$ is an odd number. Hence in these cases the summation
(3) includes only $2^{r-1}$ terms. Secondly, in the general case, if we are interested in the
contribution to the absolute moment from other orthants, we can use

$$ \{1 \pm \operatorname{sgn}(x_1)\} \ldots \{1 \pm \operatorname{sgn}(x_r)\}, \quad \text{instead of} \quad \{1 + \operatorname{sgn}(x_1)\} \ldots \{1 + \operatorname{sgn}(x_r)\}, $$

with appropriate signs.













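Example 1. Let $f(x_1, x_2, x_3)$ be the normal density function of $x_1, x_2, x_3$, with means zero,
with variances unity, and with correlations $\rho_{jk}$. Find the value of the incomplete moment

$$ \int_0^{\infty}\!\!\int_0^{\infty}\!\!\int_0^{\infty} x_1^2 x_2 x_3 f(x_1, x_2, x_3)\, dx_1\, dx_2\, dx_3. $$

In this case, since $r = 3$, and $n$ is an even number, we need only the expectations of type
(1) for which $p$ is 0 or 2. They are

$$ E[x_1^2 x_2 x_3] = \rho_{23} + 2\rho_{12}\rho_{13}, $$
$$ E[x_1^2 x_2 x_3 \operatorname{sgn}(x_1 x_2)] = \frac{2}{\pi}\{\sqrt{(1 - \rho_{12}^2)}\,(2\rho_{13} + \rho_{12}\rho_{23}) + (\rho_{23} + 2\rho_{12}\rho_{13})\sin^{-1}\rho_{12}\}, $$
$$ E[x_1^2 x_2 x_3 \operatorname{sgn}(x_1 x_3)] = \frac{2}{\pi}\{\sqrt{(1 - \rho_{13}^2)}\,(2\rho_{12} + \rho_{13}\rho_{23}) + (\rho_{23} + 2\rho_{12}\rho_{13})\sin^{-1}\rho_{13}\}, $$
$$ E[x_1^2 x_2 x_3 \operatorname{sgn}(x_2 x_3)] = \frac{2}{\pi}\{\sqrt{(1 - \rho_{23}^2)}\,(1 + \rho_{12}^2 + \rho_{13}^2) + (\rho_{23} + 2\rho_{12}\rho_{13})\sin^{-1}\rho_{23}\}, $$

therefore, according to (3), the required value is

$$ \frac{1}{4\pi}\{\sqrt{(1 - \rho_{12}^2)}\,(2\rho_{13} + \rho_{12}\rho_{23}) + \sqrt{(1 - \rho_{13}^2)}\,(2\rho_{12} + \rho_{13}\rho_{23}) + \sqrt{(1 - \rho_{23}^2)}\,(1 + \rho_{12}^2 + \rho_{13}^2) $$
$$ \qquad + (\rho_{23} + 2\rho_{12}\rho_{13})(\tfrac{1}{2}\pi + \sin^{-1}\rho_{12} + \sin^{-1}\rho_{13} + \sin^{-1}\rho_{23})\}, $$

in accordance with the result given in Kamat (1958), which includes all the incomplete
moments for $n \le 4$ in the trivariate case.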


Example 2. Let $x_1, x_2, x_3, x_4$ be distributed according to the four-variate normal
distribution, with zero means, arbitrary variances, and with correlations $\rho_{jk}$. Find the
probability $P(x_1 > 0, x_2 > 0, x_3 > 0, x_4 > 0)$.

The required value is an incomplete moment (2), putting $r = 4$ and $n = 0$. From (3)
we have

$$ P(x_1 > 0, x_2 > 0, x_3 > 0, x_4 > 0) = \frac{1}{16} + \frac{1}{8\pi}\sum_{1 \le j < k \le 4} \sin^{-1}\rho_{jk} + \frac{1}{16} E[\operatorname{sgn}(x_1x_2x_3x_4)], $$

where $E[\operatorname{sgn}(x_jx_k)] = (2/\pi)\sin^{-1}\rho_{jk}$ is given in Kendall (1948), and $E[\operatorname{sgn}(x_1x_2x_3x_4)]$ will
be treated in the next section.

It may be seen from the above examples, that in the normal case the expectations of the
type (1) have in general simpler expressions than incomplete moments. Hence it may occur
that in evaluating incomplete moments the above method is simpler than Kamat’s.

Expectations of type (1) in the general case are used, e.g. in evaluating the moments of
the statistic $c_1x_1 + c_2x_2 + c_3|x_3| + c_4|x_4|$, which is an extended form of the ones treated by
Kamat (1953b).


3. $E[\operatorname{sgn}(x_1x_2x_3x_4)]$ and $E[|x_1x_2x_3x_4|]$. Suppose that $x_1, x_2, x_3, x_4$ are distributed
according to the four-variate normal distribution, with means zero, variances unity, and
correlation matrix $R = (\rho_{jk})$. According to (1), $E[\operatorname{sgn}(x_1x_2x_3x_4)]$ is given by

$$ E[\operatorname{sgn}(x_1x_2x_3x_4)] = \frac{1}{\pi^4}\int\!\!\int\!\!\int\!\!\int_{-\infty}^{\infty} \phi(t_1, t_2, t_3, t_4)\,\frac{dt_1\,dt_2\,dt_3\,dt_4}{t_1t_2t_3t_4}, \qquad (4) $$

where $\phi(t_1, t_2, t_3, t_4) = e^{-\frac{1}{2}t'Rt}$, and $t$ is the column vector of $t$’s. By applying Plackett’s (1954)
technique to the characteristic function, we can get from the right side of (4) the Schläfli
integral (1858, 1860) for $E[\operatorname{sgn}(x_1x_2x_3x_4)]$. For example, let


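$$ K_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & \rho_{23} & \rho_{24} \\ 0 & \rho_{32} & 1 & \rho_{34} \\ 0 & \rho_{42} & \rho_{43} & 1 \end{pmatrix}, \quad K_2 = \begin{pmatrix} 1 & \rho_{12} & 0 & 0 \\ \rho_{21} & 1 & 0 & 0 \\ 0 & 0 & 1 & \rho_{34} \\ 0 & 0 & \rho_{43} & 1 \end{pmatrix}, \quad K_3 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}; $$

then $\alpha R + (1-\alpha)K_i$ $(i = 1, 2, 3)$ is also a correlation matrix, if $0 \le \alpha \le 1$. Let

$$ F_i(\alpha) = \frac{1}{\pi^4}\int\!\!\int\!\!\int\!\!\int_{-\infty}^{\infty} \exp[-\tfrac{1}{2}t'\{\alpha R + (1-\alpha)K_i\}t]\,\frac{dt_1\,dt_2\,dt_3\,dt_4}{t_1t_2t_3t_4}; $$

then $\{\partial F_i(\alpha)\}/\partial\alpha$ can be expressed by means of algebraic and arcsine functions of $\rho_{jk}$’s.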


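Let us show this for $\{\partial F_1(\alpha)\}/\partial\alpha$. We have

$$ \frac{\partial F_1(\alpha)}{\partial\alpha} = -\frac{1}{\pi^4}\left\{\rho_{12}\int\!\!\int\!\!\int\!\!\int_{-\infty}^{\infty} \exp[-\tfrac{1}{2}t'\{\alpha R + (1-\alpha)K_1\}t]\,\frac{dt_1\,dt_2\,dt_3\,dt_4}{t_3t_4} + \ldots\right\} $$
$$ = -\frac{1}{\pi^4}\{\rho_{12}G_2(\alpha) + \rho_{13}G_3(\alpha) + \rho_{14}G_4(\alpha)\}, $$

say, where $G_2(\alpha)$ is the integral written out above and $G_3(\alpha)$, $G_4(\alpha)$ have similar expressions.
To calculate $G_2(\alpha)$, carrying out the integration first with respect to $t_1$ and $t_2$, then with
respect to $t_3$ and $t_4$, we have

$$ G_2(\alpha) = \frac{2\pi}{\sqrt{(1 - \alpha^2\rho_{12}^2)}}\int\!\!\int_{-\infty}^{\infty} \exp\left[-\frac{1}{2(1 - \alpha^2\rho_{12}^2)}\{R^{(1)}_{44}(\alpha)t_3^2 - 2R^{(1)}_{34}(\alpha)t_3t_4 + R^{(1)}_{33}(\alpha)t_4^2\}\right]\frac{dt_3\,dt_4}{t_3t_4} $$
$$ = -\frac{4\pi^2}{\sqrt{(1 - \alpha^2\rho_{12}^2)}}\sin^{-1}\rho^{(1)}_{34\cdot12}(\alpha), $$

where $R^{(1)}_{jk}(\alpha)$ is the cofactor of the $(j, k)$ element in the determinant of $\alpha R + (1-\alpha)K_1$, and $\rho^{(1)}_{jk\cdot lm}(\alpha)$
is the partial correlation coefficient, given that the correlation matrix is $\alpha R + (1-\alpha)K_1$.
Similarly for $G_3(\alpha)$ and $G_4(\alpha)$. Similar expressions hold for $\{\partial F_2(\alpha)\}/\partial\alpha$ and $\{\partial F_3(\alpha)\}/\partial\alpha$,
so we have the Schläfli integral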


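$$ E[\operatorname{sgn}(x_1x_2x_3x_4)] = F_1(1) = F_1(0) + \int_0^1 \frac{\partial F_1(\alpha)}{\partial\alpha}\,d\alpha $$
$$ = \frac{4}{\pi^2}\int_0^1\left\{\frac{\rho_{12}}{\sqrt{(1 - \alpha^2\rho_{12}^2)}}\sin^{-1}\rho^{(1)}_{34\cdot12}(\alpha) + \frac{\rho_{13}}{\sqrt{(1 - \alpha^2\rho_{13}^2)}}\sin^{-1}\rho^{(1)}_{24\cdot13}(\alpha) + \frac{\rho_{14}}{\sqrt{(1 - \alpha^2\rho_{14}^2)}}\sin^{-1}\rho^{(1)}_{23\cdot14}(\alpha)\right\}d\alpha \qquad (5) $$
$$ = \frac{4}{\pi^2}\left[\sin^{-1}\rho_{12}\,\sin^{-1}\rho_{34} + \int_0^1\left\{\frac{\rho_{13}}{\sqrt{(1 - \alpha^2\rho_{13}^2)}}\sin^{-1}\rho^{(2)}_{24\cdot13}(\alpha) + \frac{\rho_{14}}{\sqrt{(1 - \alpha^2\rho_{14}^2)}}\sin^{-1}\rho^{(2)}_{23\cdot14}(\alpha)\right.\right. $$
$$ \left.\left.\qquad + \frac{\rho_{23}}{\sqrt{(1 - \alpha^2\rho_{23}^2)}}\sin^{-1}\rho^{(2)}_{14\cdot23}(\alpha) + \frac{\rho_{24}}{\sqrt{(1 - \alpha^2\rho_{24}^2)}}\sin^{-1}\rho^{(2)}_{13\cdot24}(\alpha)\right\}d\alpha\right] \qquad (6) $$
$$ = \frac{4}{\pi^2}\sum\int_0^1 \frac{\rho_{jk}}{\sqrt{(1 - \alpha^2\rho_{jk}^2)}}\sin^{-1}\rho^{(3)}_{lm\cdot jk}(\alpha)\,d\alpha, \qquad (7) $$

where the summation in (7) runs over six terms for which $(j, k, l, m)$ is a permutation of
$(1, 2, 3, 4)$ such that $j < k$ and $l < m$. These formulae can be regarded as equivalent to the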
special expressions of ©{(P) given by Plackett (1954), but I believe that the above choice 
of K’s would provide a better basis for the calculation than his choice. 










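If $\rho_{jk} = \rho$ $(j \ne k)$, then the above formulae reduce to

$$ E[\operatorname{sgn}(x_1x_2x_3x_4)] = \frac{12}{\pi^2}\int_0^1 \frac{\rho}{\sqrt{(1 - \alpha^2\rho^2)}}\sin^{-1}\frac{\rho(1 - \alpha^2\rho)}{1 + \rho - 2\alpha^2\rho^2}\,d\alpha \qquad (8) $$
$$ = \frac{4}{\pi^2}(\sin^{-1}\rho)^2 + \frac{16}{\pi^2}\int_0^1 \frac{\rho}{\sqrt{(1 - \alpha^2\rho^2)}}\sin^{-1}\frac{(1 - \rho)\alpha\rho}{1 + \rho - 2\alpha^2\rho^2}\,d\alpha \qquad (9) $$
$$ = \frac{24}{\pi^2}\int_0^1 \frac{\rho}{\sqrt{(1 - \alpha^2\rho^2)}}\sin^{-1}\frac{\alpha\rho}{1 + 2\alpha\rho}\,d\alpha. \qquad (10) $$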

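Now we proceed to evaluate $E[|x_1x_2x_3x_4|]$ under the same assumption as above. It is
expressed by (1) as

$$ E[|x_1x_2x_3x_4|] = \frac{1}{\pi^4}\int\!\!\int\!\!\int\!\!\int_{-\infty}^{\infty} \frac{\partial^4\phi(t_1, t_2, t_3, t_4)}{\partial t_1\,\partial t_2\,\partial t_3\,\partial t_4}\,\frac{dt_1\,dt_2\,dt_3\,dt_4}{t_1t_2t_3t_4}. \qquad (11) $$

The partial derivative appearing in (11) is of the form

$$ \frac{\partial^4\phi(t_1, t_2, t_3, t_4)}{\partial t_1\,\partial t_2\,\partial t_3\,\partial t_4} = \phi(t_1, t_2, t_3, t_4) \times (\text{polynomial in } t\text{’s of the fourth degree}); $$

therefore, in evaluating (11) we must find

$$ H(\gamma_1, \gamma_2, \gamma_3, \gamma_4) = \frac{1}{\pi^4}\int\!\!\int\!\!\int\!\!\int_{-\infty}^{\infty} \phi(t_1, t_2, t_3, t_4)\,\frac{t_1^{\gamma_1}t_2^{\gamma_2}t_3^{\gamma_3}t_4^{\gamma_4}}{t_1t_2t_3t_4}\,dt_1\,dt_2\,dt_3\,dt_4 $$

for all sets of non-negative integers $\gamma_1, \gamma_2, \gamma_3$ and $\gamma_4$, such that $\gamma_1 + \gamma_2 + \gamma_3 + \gamma_4 = 0$, 2, or 4.
According to (4) we have $H(0, 0, 0, 0) = E[\operatorname{sgn}(x_1x_2x_3x_4)]$, and the other $H$’s may be
obtained by the same method as shown in Nabeya (1952). Thus we have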


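$$ E[|x_1x_2x_3x_4|] = \frac{4}{\pi^2}\{\sqrt{|R|} + \sqrt{(1 - \rho_{12}^2)}\,(\rho_{34} + \rho_{13}\rho_{14} + \rho_{23}\rho_{24})\sin^{-1}\rho_{34\cdot12} $$
$$ + \sqrt{(1 - \rho_{13}^2)}\,(\rho_{24} + \rho_{12}\rho_{14} + \rho_{23}\rho_{34})\sin^{-1}\rho_{24\cdot13} + \sqrt{(1 - \rho_{14}^2)}\,(\rho_{23} + \rho_{12}\rho_{13} + \rho_{24}\rho_{34})\sin^{-1}\rho_{23\cdot14} $$
$$ + \sqrt{(1 - \rho_{23}^2)}\,(\rho_{14} + \rho_{12}\rho_{24} + \rho_{13}\rho_{34})\sin^{-1}\rho_{14\cdot23} + \sqrt{(1 - \rho_{24}^2)}\,(\rho_{13} + \rho_{12}\rho_{23} + \rho_{14}\rho_{34})\sin^{-1}\rho_{13\cdot24} $$
$$ + \sqrt{(1 - \rho_{34}^2)}\,(\rho_{12} + \rho_{13}\rho_{23} + \rho_{14}\rho_{24})\sin^{-1}\rho_{12\cdot34}\} $$
$$ + (\rho_{12}\rho_{34} + \rho_{13}\rho_{24} + \rho_{14}\rho_{23})\,E[\operatorname{sgn}(x_1x_2x_3x_4)], \qquad (12) $$

where $|R|$ denotes the determinant of $R$.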
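
Formula (12) lends itself to direct evaluation once $E[\operatorname{sgn}(x_1x_2x_3x_4)]$ is known. The following is a minimal sketch under stated assumptions: the correlation matrix is non-singular, the value of $E[\operatorname{sgn}(x_1x_2x_3x_4)]$ is supplied by the caller, and all function names are hypothetical. Indices in the code run from 0 to 3.

```python
import math
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion (adequate for a 4 x 4 matrix)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** c * m[0][c]
               * det([row[:c] + row[c + 1:] for row in m[1:]])
               for c in range(len(m)))


def partial_corr(R, l, m, j, k):
    """Correlation of x_l and x_m after conditioning on x_j and x_k."""
    d = 1 - R[j][k] ** 2

    def cond_cov(a, b):
        # R[a][b] minus the part explained by regression on (x_j, x_k).
        explained = (R[a][j] * R[b][j] + R[a][k] * R[b][k]
                     - R[j][k] * (R[a][j] * R[b][k] + R[a][k] * R[b][j])) / d
        return R[a][b] - explained

    return cond_cov(l, m) / math.sqrt(cond_cov(l, l) * cond_cov(m, m))


def abs_moment_4(R, e_sgn):
    """E|x1 x2 x3 x4| for unit variances, following the structure of (12)."""
    total = math.sqrt(det(R))
    for j, k in combinations(range(4), 2):
        l, m = [i for i in range(4) if i not in (j, k)]
        coef = R[l][m] + R[j][l] * R[j][m] + R[k][l] * R[k][m]
        total += (math.sqrt(1 - R[j][k] ** 2) * coef
                  * math.asin(partial_corr(R, l, m, j, k)))
    pairings = R[0][1] * R[2][3] + R[0][2] * R[1][3] + R[0][3] * R[1][2]
    return 4 / math.pi ** 2 * total + pairings * e_sgn


identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
equal_half = [[1.0 if i == j else 0.5 for j in range(4)] for i in range(4)]
```

For independent variables the function returns $(2/\pi)^2$, the product of four values of $E|x| = \sqrt{(2/\pi)}$. With all correlations equal to $\tfrac{1}{2}$ and $E[\operatorname{sgn}] = \tfrac{1}{5}$ it reproduces the equal-correlation special case.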
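The formula checks with the previous ones in all the cases where $\rho_{12} = \rho_{13} = \rho_{14} = 0$, or
$\rho_{13} = \rho_{14} = \rho_{23} = \rho_{24} = 0$, or $\rho_{13} = 1$, $\rho_{12} = \rho_{23}$ and $\rho_{14} = \rho_{34}$. In the case $\rho_{jk} = \rho$ $(j \ne k)$,
we have

$$ E[|x_1x_2x_3x_4|] = \frac{4}{\pi^2}\left\{\sqrt{[(1 + 3\rho)(1 - \rho)^3]} + 6\rho(1 + 2\rho)\sqrt{(1 - \rho^2)}\,\sin^{-1}\frac{\rho}{1 + 2\rho}\right\} + 3\rho^2E[\operatorname{sgn}(x_1x_2x_3x_4)], \qquad (13) $$

where $E[\operatorname{sgn}(x_1x_2x_3x_4)]$ was given in (8)-(10).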


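4. $E[|x_1^{n_1}x_2^{n_2}x_3^{n_3}x_4^{n_4}|]$ for cases $2 \ge n_1 \ge n_2 \ge n_3 \ge n_4 \ge 1$. In the following we give the
absolute moments $E[|x_1^{n_1}x_2^{n_2}x_3^{n_3}x_4^{n_4}|]$ for the cases $2 \ge n_1 \ge n_2 \ge n_3 \ge n_4 \ge 1$. In evaluating
these we have no more difficulty than in the two- or three-variate cases.

$$ E[|x_1^2x_2x_3x_4|] = (2/\pi)^{3/2}\{\sqrt{(R_{11})}\,(1 + \rho_{12}^2 + \rho_{13}^2 + \rho_{14}^2) $$
$$ + (\rho_{34} + 2\rho_{13}\rho_{14} + \rho_{23}\rho_{24} + 2\rho_{12}\rho_{13}\rho_{24} + 2\rho_{12}\rho_{14}\rho_{23} + \rho_{12}^2\rho_{34} - \rho_{12}^2\rho_{23}\rho_{24})\sin^{-1}\rho_{34\cdot2} $$
$$ + (\rho_{24} + 2\rho_{12}\rho_{14} + \rho_{23}\rho_{34} + 2\rho_{12}\rho_{13}\rho_{34} + 2\rho_{13}\rho_{14}\rho_{23} + \rho_{13}^2\rho_{24} - \rho_{13}^2\rho_{23}\rho_{34})\sin^{-1}\rho_{24\cdot3} $$
$$ + (\rho_{23} + 2\rho_{12}\rho_{13} + \rho_{24}\rho_{34} + 2\rho_{12}\rho_{14}\rho_{34} + 2\rho_{13}\rho_{14}\rho_{24} + \rho_{14}^2\rho_{23} - \rho_{14}^2\rho_{24}\rho_{34})\sin^{-1}\rho_{23\cdot4}\}, $$

$$ E[|x_1^2x_2^2x_3x_4|] = (2/\pi)\{\sqrt{(1 - \rho_{34}^2)}\,[2\rho_{12}^2 + 2\rho_{13}^2 + 2\rho_{14}^2 + 2\rho_{23}^2 + 2\rho_{24}^2 + \rho_{34}^2 $$
$$ + 4\rho_{12}\rho_{13}\rho_{23} + 4\rho_{12}\rho_{14}\rho_{24} - 2\rho_{13}\rho_{14}\rho_{34} - 2\rho_{23}\rho_{24}\rho_{34} - 2\rho_{13}^2\rho_{23}^2 - 2\rho_{14}^2\rho_{24}^2 $$
$$ + 4\rho_{13}\rho_{14}\rho_{23}\rho_{24} + R_{11}R_{22}/(1 - \rho_{34}^2)] + (\rho_{34} + 2\rho_{13}\rho_{14} + 2\rho_{23}\rho_{24} + 4\rho_{12}\rho_{13}\rho_{24} $$
$$ + 4\rho_{12}\rho_{14}\rho_{23} + 2\rho_{12}^2\rho_{34})\sin^{-1}\rho_{34}\}, $$

$$ E[|x_1^2x_2^2x_3^2x_4|] = (2/\pi)^{1/2}(1 + 2\rho_{12}^2 + 2\rho_{13}^2 + 2\rho_{23}^2 + \rho_{14}^2 + \rho_{24}^2 + \rho_{34}^2 $$
$$ + 8\rho_{12}\rho_{13}\rho_{23} + 4\rho_{12}\rho_{14}\rho_{24} + 4\rho_{13}\rho_{14}\rho_{34} + 4\rho_{23}\rho_{24}\rho_{34} + 2\rho_{12}^2\rho_{34}^2 $$
$$ + 2\rho_{13}^2\rho_{24}^2 + 2\rho_{14}^2\rho_{23}^2 - \rho_{14}^2\rho_{24}^2 - \rho_{14}^2\rho_{34}^2 - \rho_{24}^2\rho_{34}^2 + 8\rho_{12}\rho_{13}\rho_{24}\rho_{34} $$
$$ + 8\rho_{12}\rho_{14}\rho_{23}\rho_{34} + 8\rho_{13}\rho_{14}\rho_{23}\rho_{24} - 4\rho_{12}\rho_{14}\rho_{24}\rho_{34}^2 - 4\rho_{13}\rho_{14}\rho_{24}^2\rho_{34} $$
$$ - 4\rho_{14}^2\rho_{23}\rho_{24}\rho_{34} + 3\rho_{14}^2\rho_{24}^2\rho_{34}^2), $$

$$ E[x_1^2x_2^2x_3^2x_4^2] = 1 + 2\rho_{12}^2 + 2\rho_{13}^2 + 2\rho_{14}^2 + 2\rho_{23}^2 + 2\rho_{24}^2 + 2\rho_{34}^2 $$
$$ + 8\rho_{12}\rho_{13}\rho_{23} + 8\rho_{12}\rho_{14}\rho_{24} + 8\rho_{13}\rho_{14}\rho_{34} + 8\rho_{23}\rho_{24}\rho_{34} + 4\rho_{12}^2\rho_{34}^2 + 4\rho_{13}^2\rho_{24}^2 $$
$$ + 4\rho_{14}^2\rho_{23}^2 + 16\rho_{12}\rho_{13}\rho_{24}\rho_{34} + 16\rho_{12}\rho_{14}\rho_{23}\rho_{34} + 16\rho_{13}\rho_{14}\rho_{23}\rho_{24}. $$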


5. The fourth moment of mean deviation and of Gini’s mean difference. Let $z_1, \ldots, z_n$ be
a sample from a normal population with variance $\sigma^2$, and let $\bar{z}$ denote their mean. Then

$$ m = \frac{1}{n}\sum_{j=1}^{n}|z_j - \bar{z}| \quad \text{and} \quad g = \frac{2}{n(n-1)}\sum_{j<k}|z_j - z_k| $$

are the mean deviation and Gini’s mean difference for the sample, respectively. In this
section we evaluate the fourth moments $\mu_4'(m)$ and $\mu_4'(g)$ of these two statistics.

First, we consider $\mu_4'(m)$. Let $x_j = z_j - \bar{z}$ $(1 \le j \le n)$; then the only difficulty in evaluating
$\mu_4'(m)$ is in finding the exact expression for $E[|x_1x_2x_3x_4|]$. It is given by $(n-1)^2\sigma^4/n^2$ times
(13), putting $\rho = -1/(n-1)$, and we have


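$$ \mu_4'(m) = \frac{(n-1)^2\sigma^4}{n^5}\Big[3 + 3(n-1)(1 + 2\rho^2) + \frac{8(n-1)}{\pi}\{\sqrt{(1 - \rho^2)}\,(2 + \rho^2) + 3\rho\sin^{-1}\rho\} $$
$$ + \frac{12}{\pi}(n-1)(n-2)\{\sqrt{(1 - \rho^2)}\,(1 + 2\rho^2) + \rho(1 + 2\rho)\sin^{-1}\rho\} $$
$$ + (n-1)(n-2)(n-3)\Big(\frac{4}{\pi^2}\Big\{\sqrt{\{(1 + 3\rho)(1 - \rho)^3\}} + 6\rho(1 + 2\rho)\sqrt{(1 - \rho^2)}\,\sin^{-1}\frac{\rho}{1 + 2\rho}\Big\} $$
$$ + 3\rho^2E[\operatorname{sgn}(x_1x_2x_3x_4)]\Big)\Big]. $$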


Geary (1936) has given $E[|x_1x_2x_3x_4|]$ in the form of a series expansion in powers of $(n-2)^{-1}$,
namely

4 (n+1)* (n—3)8 
mm (n—2)¢ 


32 
ras aR (n —2)° 
+ 172 - 640 12736 = 47104 
(n—2)* (n—2)5 5(n—2)® 5(n—2)’ 








E[ |x %_%3%4|] = 


a 





+...}. (14) 


Both (14) and $(n-1)^2\sigma^4/n^2$ times (13), if they are written in the form of series expansion in
$\rho = -(n-1)^{-1}$, agree with


4 —1\?2 
ET |x, %_%32%,|] = 5) o*(1 + 3p? + 4p? + p*— 4p + 3 p> —48p7...). 








Now we proceed to the evaluation of $\mu_4'(g)$. Let $x_{jk} = z_j - z_k$; then we have


16 
= n(n — 1)4 {4n@E [zt.]+4n31H [|i2273|] 


H4(9) 
+ n'SB[ |a%,2%54|]+ 3n!L[adgats] + Jnl H[x3,224) 
+ Gn!) E[| 2.2344) ]+ 12n/1H[|ai,215%54|] + 6nE| |ar},245 2951] 
+ 6n!ME[ |ai,% 154) ] + OnE [| 2}, 013% 45|] + 3! 1H [| 23, 27542551] 
+ $UE| | xG2 54256) ] + MOH | 192% 3%442y5|] + 12nE[ [212013214 %29| ] 
+ L2n!\E[| 2192132 14%95|]+ 2H 22102437 4X56] ] + 30H | 24923724 Xgal ] 
+ L2nl\E[| 219213 % 94X35] + 2h HE] | 212% y5%pq%q5|] + 6n!H| | 219243 X24 X56! ] 
+ BNE [ |ary92y3% 45% qe] ] + Fn! Hr 121 45% 45% pq|] + gM! \H| |27y9%y4 2567s! ]}- 
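Among the twenty-three expectations appearing in the above expressions, eighteen may be
evaluated by means of the formulae already published. To evaluate the remaining five,
(12) is useful. Let us, for example, consider $E[|x_{12}x_{13}x_{14}x_{25}|]$. We have

$$ \sigma^2(x_{12}) = \sigma^2(x_{13}) = \sigma^2(x_{14}) = \sigma^2(x_{25}) = 2\sigma^2, $$
$$ \rho(x_{12}, x_{13}) = \rho(x_{12}, x_{14}) = \rho(x_{13}, x_{14}) = \tfrac{1}{2}, $$
$$ \rho(x_{12}, x_{25}) = -\tfrac{1}{2}, \quad \rho(x_{13}, x_{25}) = \rho(x_{14}, x_{25}) = 0, $$

and $\operatorname{sgn}(x_{12}x_{13}x_{14}x_{25})$ is determined by the order of magnitude of $z_1, z_2, z_3, z_4$ and $z_5$. In this
case every order of $z$’s is equally probable, so that $E[\operatorname{sgn}(x_{12}x_{13}x_{14}x_{25})] = -\tfrac{1}{15}$. Hence
we get

$$ E[|x_{12}x_{13}x_{14}x_{25}|] = \left[\frac{1}{\pi^2}\left(4\sqrt{5} + 2\sqrt{3}\sin^{-1}\tfrac{1}{4} + 24\sin^{-1}\frac{1}{\sqrt{6}} + 4\sqrt{3}\sin^{-1}\sqrt{\tfrac{3}{8}}\right) + \frac{1}{15}\right]\sigma^4. $$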
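
Because every ordering of the $z$’s is equally probable, expectations of sign products of differences can be obtained exactly by enumerating rank orderings. The sketch below does this with exact fractions; the function name and the representation of a product as a list of index pairs $(j, k)$ standing for $x_{jk} = z_j - z_k$ are illustrative choices.

```python
from fractions import Fraction
from itertools import permutations


def expected_sign(pairs, n_points):
    """Exact E[sgn(prod of (z_j - z_k))] for i.i.d. continuous z_1..z_n.

    pairs    -- list of (j, k) with 1-based indices, one per factor z_j - z_k
    n_points -- number of distinct z's involved
    """
    total = 0
    count = 0
    for ranks in permutations(range(n_points)):
        sign = 1
        for j, k in pairs:
            # The factor z_j - z_k is positive when z_j has the larger rank.
            sign *= 1 if ranks[j - 1] > ranks[k - 1] else -1
        total += sign
        count += 1
    return Fraction(total, count)


spider = expected_sign([(1, 2), (1, 3), (1, 4), (2, 5)], 5)   # x12 x13 x14 x25
star = expected_sign([(1, 2), (1, 3), (1, 4), (1, 5)], 5)     # x12 x13 x14 x15
```

Only 120 orderings are involved for five observations, so the enumeration is immediate; the first value is the $-\tfrac{1}{15}$ used in the worked example.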


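We can obtain similarly

$$ E[|x_{12}x_{13}x_{14}x_{15}|] = \left[\frac{1}{\pi^2}\left(4\sqrt{5} + 48\sqrt{3}\sin^{-1}\tfrac{1}{4}\right) + \frac{3}{5}\right]\sigma^4, $$
$$ E[|x_{12}x_{13}x_{14}x_{23}|] = \frac{1}{\pi}(4 + 2\sqrt{3})\,\sigma^4, $$
$$ E[|x_{12}x_{13}x_{24}x_{34}|] = \left[\frac{1}{\pi}(8\sqrt{3} - 8) + \frac{2}{3}\right]\sigma^4, $$
$$ E[|x_{12}x_{13}x_{24}x_{35}|] = \left[\frac{1}{\pi^2}\left(4\sqrt{5} + 8\sin^{-1}\tfrac{2}{3} - 8\sin^{-1}\frac{1}{\sqrt{6}} + 8\sqrt{3}\sin^{-1}\sqrt{\tfrac{3}{8}}\right) + \frac{2}{15}\right]\sigma^4. $$

The last expectation, denoted by $\mathcal{E}(d_1d_2d_3d_4)$, has been obtained by Kamat (1953b), numerically,
using my integral formula (Nabeya, 1952). Finally we have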


Si ew gy —* (72/34 267) + 4 =?" (1124 144,/3 + 21m) 
Pad) ~ 7 3(n 153 7 
(n— 2)®) 1 a! —s 
ee gee 100 ,/5+72,/3sin— } + 96 sin $+ 192 sin et 1/3 


1 
+1204 12/304 50") + + EN (36+ 64/24 485in-*4 + 96 sin-! — 


V3 
(n— =. |. 





(n— 2) 
7 


+6n-+4,/30+4n*) + (12/3 + 277) +- 








84 S. NABEYA 


I express sincere thanks to Prof. E. S. Pearson and the referees for calling my attention
to Plackett's and Schläfli's work.


REFERENCES 


GEARY, R. C. (1936). Moments of the ratio of the mean deviation to the standard deviation for
normal samples. Biometrika, 28, 295-305.

KAMAT, A. R. (1953a). Incomplete and absolute moments of the multivariate normal distribution
with some applications. Biometrika, 40, 20-34.

KAMAT, A. R. (1953b). On the mean successive difference and its ratio to the root mean square.
Biometrika, 40, 116-27.

KAMAT, A. R. (1953c). The third moment of Gini's mean differences. Biometrika, 40, 451-2.

KAMAT, A. R. (1954). Moments of a mean deviation. Biometrika, 41, 541-2.

KAMAT, A. R. (1958). Incomplete moments of the trivariate normal distribution. Sankhyā, 20,
321-2.

KENDALL, M. G. (1948). Rank Correlation Methods. London: C. Griffin and Co.

NABEYA, S. (1951). Absolute moments in 2-dimensional normal distribution. Ann. Inst. Statist.
Math. 3, 2-6.

NABEYA, S. (1952). Absolute moments in 3-dimensional normal distribution. Ann. Inst. Statist.
Math. 4, 15-29.

PLACKETT, R. L. (1954). A reduction formula for normal multivariate integrals. Biometrika, 41,
351-60.

SCHLÄFLI, L. (1858, 1860). On the multiple integral $\int dx\,dy \dots dz$ whose limits are
$p_1 = a_1 x + b_1 y + \dots + h_1 z > 0$, $p_2 > 0, \dots, p_n > 0$ and $x^2 + y^2 + \dots + z^2 < 1$.
Quart. J. Pure Appl. Math. 2, 269-301; 3, 54-68, 97-108.













Biometrika (1961), 48, 1 and 2, p. 85 85 
Printed in Great Britain 


Asymptotic expansions for the mean and variance of the 
serial correlation coefficient 


By JOHN S. WHITE
Mathematics Group, General Motors Research Laboratories 


SUMMARY 


Two models are considered for the estimation of the serial correlation coefficient $\alpha$ of a
first-order auto-regressive Gaussian process. Series expansions are obtained for the first
two moments of $\hat\alpha$, the least squares estimator for $\alpha$. The series expansions are carried to
terms of order $T^{-3}$ and $\alpha^4$ (where $T$ is the observed length of the series) thus extending the
asymptotic results of several authors (e.g. Bartlett, 1946; Hurwicz, 1950; Kendall, 1954;
Marriott & Pope, 1954).


INTRODUCTION 


We consider the discrete Gaussian process $(x_t)$ satisfying the stochastic difference equation

$$x_t = \alpha x_{t-1} + u_t, \qquad (1)$$

where $\alpha$ is an unknown parameter and the $u$'s are independent normal variables with zero
mean and variance $\sigma^2$. The distribution of a sample $x = (x_1, \dots, x_T)$ is not uniquely
determined by (1) but depends also upon the specification of an initial condition. If we take
$x_0 = c$ (a constant) then the density function for $x$ is

$$f(x) = (2\pi\sigma^2)^{-T/2} \exp\left[-\frac{1}{2\sigma^2}\sum_{t=1}^{T}(x_t - \alpha x_{t-1})^2\right]. \qquad (2)$$

A second possible initial condition is that the process be stationary. This implies that each
$x_t$ is $N\{0, \sigma^2/(1-\alpha^2)\}$ and that the density function for $x$ is

$$f^*(x) = (2\pi\sigma^2)^{-T/2}(1-\alpha^2)^{1/2} \exp\left[-\frac{1}{2\sigma^2}\left\{(1-\alpha^2)x_1^2 + \sum_{t=2}^{T}(x_t - \alpha x_{t-1})^2\right\}\right]. \qquad (3)$$


We shall speak of the process which satisfies (2) as model A and the process which satisfies 
(3) as model B. To differentiate between the two models we shall frequently affix an asterisk 
to expressions referring to model B. 


MAXIMUM LIKELIHOOD ESTIMATORS 
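For model A the maximum likelihood estimators are

$$\hat\alpha = \frac{\sum_{t=1}^{T} x_t x_{t-1}}{\sum_{t=1}^{T} x_{t-1}^2}, \qquad \hat\sigma^2 = \frac{1}{T}\sum_{t=1}^{T}(x_t - \hat\alpha x_{t-1})^2.$$

It may also be noted that $\hat\alpha$ is the least squares estimator as obtained by minimizing

$$\sum_{t=1}^{T}(x_t - \alpha x_{t-1})^2 = \sum u_t^2.$$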





86 JoHN 8. WHITE 
Following Koopmans (1942) we define the statistics
$$L = x_1^2 + x_T^2,$$
$$M = x_1 x_2 + x_2 x_3 + \dots + x_{T-1} x_T,$$
$$N = x_2^2 + \dots + x_{T-1}^2.$$
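For model B the maximum likelihood estimators are then the solutions of the equations

$$(M - \alpha N)(1 - \alpha^2) - \frac{\alpha}{T}(L - 2\alpha M + N + \alpha^2 N) = 0,$$
$$\sigma^2 = \frac{L - 2\alpha M + N + \alpha^2 N}{T}.$$

The first of these equations may be written as

$$g(\alpha) = \alpha^3 - \frac{T-2}{T-1}\,\frac{M}{N}\,\alpha^2 - \frac{L + (T+1)N}{(T-1)N}\,\alpha + \frac{T}{T-1}\,\frac{M}{N} = 0.$$

The three roots of $g(\alpha)$ may be located by considering

$$g(+\infty) = +\infty,$$
$$g(1) = \frac{2M - 2N - L}{N(T-1)} < 0,$$
$$g(-1) = \frac{2M + 2N + L}{N(T-1)} > 0,$$
$$g(-\infty) = -\infty.$$

Thus $g(\alpha) = 0$ has three roots, say $\alpha_1^*$, $\alpha_2^*$ and $\alpha_3^*$, such that

$$\alpha_1^* < -1 < \alpha_2^* < 1 < \alpha_3^*,$$

and the maximum likelihood estimator for $\alpha$ is $\alpha_2^*$, the unique root between $-1$ and $+1$.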
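For large $T$ we have

$$g(\alpha) \cong \alpha^3 - \frac{M}{N}\alpha^2 - \alpha + \frac{M}{N} = \left(\alpha - \frac{M}{N}\right)(\alpha^2 - 1).$$

Hence, the three roots of $g(\alpha)$ are asymptotically $1$, $-1$ and $M/N$.
For our two models we then have as estimators for $\alpha$

$$\hat\alpha = \frac{x_0 x_1 + x_1 x_2 + \dots + x_{T-1} x_T}{x_0^2 + x_1^2 + \dots + x_{T-1}^2},$$

$$\hat\alpha^* \cong \frac{M}{N} = \frac{x_1 x_2 + x_2 x_3 + \dots + x_{T-1} x_T}{x_2^2 + \dots + x_{T-1}^2}.$$

Since these two estimators are asymptotically the same, we shall focus our attention on $\hat\alpha$
and consider some of its properties with respect to both models.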
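
The location of the model B root can be illustrated numerically. The following is a minimal sketch, not the author's procedure: it assumes $\sigma^2 = 1$, uses a simulated stationary series, and finds the root of the cubic $g$ by bisection on $(-1, 1)$. The function names (`koopmans_stats`, `g`, `model_b_mle`, `simulate_stationary_ar1`) are hypothetical and introduced only for this illustration.

```python
import random


def koopmans_stats(x):
    """Return (L, M, N) for a sample x = [x_1, ..., x_T]."""
    L = x[0] ** 2 + x[-1] ** 2
    M = sum(a * b for a, b in zip(x[:-1], x[1:]))
    N = sum(v * v for v in x[1:-1])
    return L, M, N


def g(alpha, L, M, N, T):
    """Cubic whose root in (-1, 1) is the model B likelihood estimate."""
    return (alpha ** 3
            - (T - 2) / (T - 1) * (M / N) * alpha ** 2
            - (L + (T + 1) * N) / ((T - 1) * N) * alpha
            + T / (T - 1) * (M / N))


def model_b_mle(x, tol=1e-12):
    """Bisection on [-1, 1], using g(-1) > 0 > g(1)."""
    T = len(x)
    L, M, N = koopmans_stats(x)
    lo, hi = -1.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid, L, M, N, T) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate_stationary_ar1(alpha, T, rng):
    """Stationary start (model B) with unit innovation variance (an assumption)."""
    x = [rng.gauss(0.0, (1 - alpha ** 2) ** -0.5)]
    for _ in range(T - 1):
        x.append(alpha * x[-1] + rng.gauss(0.0, 1.0))
    return x


rng = random.Random(0)
x = simulate_stationary_ar1(0.5, 200, rng)
L, M, N = koopmans_stats(x)
root = model_b_mle(x)
print(root, M / N)
```

For a series of this length the bisection root and the ratio $M/N$ should differ only by a quantity of order $1/T$, which is the sense in which the two estimators are asymptotically the same.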











The mean and variance of the serial correlation coefficient 87 


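Model A, $x_0 = 0$
We shall consider model A first for the case $c = x_0 = 0$. Since $\hat\alpha$ is independent of $\sigma^2$ we
may also set $\sigma^2 = 1$. Following the procedure used by Williams (1941) and Hurwicz (1950)
we set $U = \sum x_t x_{t-1}$ and $V = \sum x_{t-1}^2$. The joint moment generating function of $U$ and $V$ is then

$$E(\exp\{Uu + Vv\}) = m(u, v) = \int \exp(Uu + Vv) f(x)\,dx = (2\pi)^{-T/2}\int \exp(-\tfrac12 x D x')\,dx = |D|^{-1/2}, \qquad (4)$$

where $D$ is the $T \times T$ matrix whose determinant is

$$D(T) = |D| = \begin{vmatrix} p & q & & & 0 \\ q & p & q & & \\ & \ddots & \ddots & \ddots & \\ & & q & p & q \\ 0 & & & q & 1 \end{vmatrix}, \qquad (5)$$

$$p = 1 + \alpha^2 - 2v, \qquad q = -(\alpha + u).$$

The moments of $\hat\alpha = U/V$ may then be obtained from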


Ok 
win P ASML asa 


Hurwicz has noted that even for small values of $T$ ($T > 6$) the integrals involved in
evaluating $E(\hat\alpha)$ are hyperelliptic and their evaluation in closed form is next to impossible. On
the other hand it is feasible to expand the integrand in a Maclaurin series and integrate
termwise.

It turns out that $E(\hat\alpha)$ is an odd function of $\alpha$ while $E(\hat\alpha^2)$ is an even function of $\alpha$. Setting
$\beta = \alpha^2$, we define

$$E(\hat\alpha)/\alpha = Q(\beta) = Q(0) + Q'(0)\beta + Q''(0)\frac{\beta^2}{2!} + \dots,$$

$$E(\hat\alpha^2) = R(\beta) = R(0) + R'(0)\beta + R''(0)\frac{\beta^2}{2!} + \dots.$$


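To expand the integrand of (8) in a Maclaurin series we note that, since $D(T)^{-1/2} = m(u, v)$,
we have, for example,

$$Q(0) = \int_{-\infty}^{0}\left[\frac{1}{\alpha}\frac{\partial m}{\partial u}\right]_{u=0,\,\alpha=0} dv = -\frac12\int_{-\infty}^{0}\left[\frac{1}{\alpha}D(T)^{-3/2}\frac{\partial D(T)}{\partial u}\right]_{u=0,\,\alpha=0} dv,$$

$$Q'(0) = -\frac12\int_{-\infty}^{0}\left[\frac{\partial}{\partial\beta}\left\{\frac{1}{\alpha}D(T)^{-3/2}\frac{\partial D(T)}{\partial u}\right\}\right]_{u=0,\,\alpha=0} dv,$$

with similar expressions for the remaining coefficients in $Q(\beta)$ and $R(\beta)$.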







We note from (5), expanding $D(T)$ by elements of the first row, that $D(T)$ satisfies the
difference equation

$$D(T) = pD(T-1) - q^2 D(T-2). \qquad (7)$$

From the initial conditions $D(1) = 1$, $D(2) = p - q^2$ we find that

$$D(T) = \sum_{k=0} (-1)^k p^{T-2k-2} q^{2k}\left\{p\binom{T-k-1}{k} - q^2\binom{T-k-2}{k}\right\}$$
$$= \sum_{k=0} (-1)^k (x+\beta)^{T-2k-2} y^{k}\left\{(x+\beta)\binom{T-k-1}{k} - y\binom{T-k-2}{k}\right\} \qquad (8)$$
$$(x = 1 - 2v,\quad y = (\alpha + u)^2).$$


The various derivatives required to evaluate expressions such as (10) may now be obtained
from (8). Writing

$$D(T, i, j) = \left[\frac{d^{j}}{d\beta^{j}}\,\frac{\partial^{i} D(T)}{\partial y^{i}}\right]_{\beta = 0},$$

where the derivatives with respect to $\beta$ are taken after setting $u = 0$ (so that $y = \beta$ and $p = x + \beta$), we have

$$D(T,0,0) = x^{T-1},$$
$$D(T,0,1) = (T-2)x^{T-3}(x-1),$$
$$D(T,0,2) = (T-3)x^{T-5}\{(T-2)x^2 - 2(T-3)x + (T-4)\},$$
$$D(T,1,0) = -x^{T-3}(x + T - 2),$$
$$D(T,1,1) = -x^{T-5}\{(T-2)x^2 + (T-3)(T-4)x - (T-3)(T-4)\},$$
$$D(T,1,2) = -x^{T-7}\{(T-2)(T-3)x^3 + (T-3)(T-4)(T-6)x^2 - (T-4)(T-5)(2T-9)x + (T-4)(T-5)(T-6)\},$$
$$D(T,2,0) = (T-3)x^{T-5}(2x + T - 4),$$
$$D(T,2,1) = (T-4)x^{T-7}\{2(T-3)x^2 + (T-5)(T-6)x - (T-5)(T-6)\}.$$
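As an example we evaluate $Q'(0)$ explicitly. We have

$$\frac{1}{\alpha}\frac{\partial m}{\partial u}\bigg|_{u=0} = -\frac{1}{2\alpha}D(T)^{-3/2}\frac{\partial D(T)}{\partial y}\frac{\partial y}{\partial u}\bigg|_{u=0} = -\frac{(\alpha+u)}{\alpha}D(T)^{-3/2}\frac{\partial D(T)}{\partial y}\bigg|_{u=0} = -D(T)^{-3/2}\frac{\partial D(T)}{\partial y}\bigg|_{y=\beta},$$

so that

$$Q'(0) = \frac12\int_{-\infty}^{0}\{3D(T,0,0)^{-5/2}D(T,0,1)D(T,1,0) - 2D(T,0,0)^{-3/2}D(T,1,1)\}\,dv$$

$$= \frac14\int_{1}^{\infty}\{-(T-2)x^{-\frac12(T+3)} - (T-3)(T+2)x^{-\frac12(T+5)} + (T^2+2T-12)x^{-\frac12(T+7)}\}\,dx$$

$$= \frac{12}{(T+1)(T+3)(T+5)}.$$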


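In a similar fashion we obtain the remaining coefficients of the Maclaurin series.

$$Q(0) = \frac{T^2 - 2T + 3}{(T-1)(T+1)} = 1 - \frac{2}{T} + \frac{4}{T^2} - \frac{2}{T^3} + O(T^{-4}),$$

$$Q'(0) = \frac{12}{(T+1)(T+3)(T+5)} = \frac{12}{T^3} + O(T^{-4}),$$

$$Q''(0) = \frac{36(T+8)}{(T+3)(T+5)(T+7)(T+9)} = \frac{36}{T^3} + O(T^{-4}),$$

$$E(\hat\alpha) = \left(1 - \frac{2}{T} + \frac{4}{T^2} - \frac{2}{T^3}\right)\alpha + \frac{12\alpha^3}{T^3} + \frac{18\alpha^5}{T^3} + \dots. \qquad (9)$$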














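$$R(0) = \frac{T^2 - 4T + 7}{(T-3)(T-1)(T+1)} = \frac{1}{T} - \frac{1}{T^2} + \frac{5}{T^3} + O(T^{-4}),$$

$$R'(0) = \frac{T^4 + 3T^3 - 4T^2 + 21T + 63}{(T-1)(T+1)(T+3)(T+5)} = 1 - \frac{5}{T} + \frac{22}{T^2} - \frac{77}{T^3} + O(T^{-4}),$$

$$R''(0) = \frac{18(3T^2 + 48T + 149)}{(T+1)(T+3)(T+5)(T+7)(T+9)} = \frac{54}{T^3} + O(T^{-4}),$$

$$E(\hat\alpha^2) = \left(\frac{1}{T} - \frac{1}{T^2} + \frac{5}{T^3}\right) + \left(1 - \frac{5}{T} + \frac{22}{T^2} - \frac{77}{T^3}\right)\alpha^2 + \frac{27}{T^3}\alpha^4 + \dots.$$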
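
The closed forms for the model A coefficients can be checked with exact rational arithmetic. The sketch below is an illustration only: the termwise integrals are written out by hand for this example (using $\int_1^\infty x^{-s}\,dx = 1/(s-1)$ and the substitution $x = 1 - 2v$), and the function names are hypothetical. It compares those sums with the rational expressions $(T^2-2T+3)/\{(T-1)(T+1)\}$, $12/\{(T+1)(T+3)(T+5)\}$ and $(T^2-4T+7)/\{(T-3)(T-1)(T+1)\}$.

```python
from fractions import Fraction as F


def q0_termwise(T):
    # (1/2) * integral over [1, inf) of x^-(T+1)/2 + (T-2) x^-(T+3)/2
    return F(1, T - 1) + F(T - 2, T + 1)


def q1_termwise(T):
    # (1/4) * integral of the three-term integrand for Q'(0)
    return F(1, 2) * (-F(T - 2, T + 1)
                      - F((T - 3) * (T + 2), T + 3)
                      + F(T * T + 2 * T - 12, T + 5))


def r0_termwise(T):
    # (1/4) * integral of (x - 1)(x + T - 2) x^-(T+3)/2; needs T > 3
    return F(1, 2) * (F(1, T - 3) + F(T - 3, T - 1) - F(T - 2, T + 1))


def q0_closed(T):
    return F(T * T - 2 * T + 3, (T - 1) * (T + 1))


def q1_closed(T):
    return F(12, (T + 1) * (T + 3) * (T + 5))


def r0_closed(T):
    return F(T * T - 4 * T + 7, (T - 3) * (T - 1) * (T + 1))


def mean_series(alpha, T):
    """Truncated large-T series for E(alpha_hat), model A with x_0 = 0."""
    return ((1 - 2 / T + 4 / T ** 2 - 2 / T ** 3) * alpha
            + 12 * alpha ** 3 / T ** 3 + 18 * alpha ** 5 / T ** 3)


for T in range(4, 40):
    assert q0_termwise(T) == q0_closed(T)
    assert q1_termwise(T) == q1_closed(T)
    assert r0_termwise(T) == r0_closed(T)

print(mean_series(0.5, 20))
```

Exact fractions are used so that agreement is an identity in $T$ rather than a floating-point coincidence; `mean_series` only evaluates the truncated expansion and carries no error bound.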


Model B
For model B (4) becomes

$$E^*(\exp\{Uu + Vv\}) = m^*(u, v) = \int \exp(Uu + Vv) f^*(x)\,dx$$
$$= (2\pi)^{-T/2}(1-\alpha^2)^{1/2}\int \exp(-\tfrac12 x D^* x')\,dx = (1-\alpha^2)^{1/2}|D^*|^{-1/2},$$

where $D^*$ is the $T \times T$ matrix whose determinant is

$$D^*(T) = \begin{vmatrix} p-\beta & q & & & 0 \\ q & p & q & & \\ & \ddots & \ddots & \ddots & \\ & & q & p & q \\ 0 & & & q & 1 \end{vmatrix},$$

$$p = 1 + \alpha^2 - 2v, \qquad q = -(\alpha + u), \qquad \beta = \alpha^2.$$

Expanding $D^*(T)$ by the elements of the first row we have

$$D^*(T) = (p - \beta)D(T-1) - q^2 D(T-2), \qquad (10)$$

where $D(T)$ is as defined in (5). Combining (10) and (7) we find that

$$D^*(T) = D(T) - \beta D(T-1).$$

Defining $D^*(T, i, j)$ from $D^*(T)$ in the same way as $D(T, i, j)$ was defined from $D(T)$, we have

$$D^*(T,0,0) = x^{T-1},$$
$$D^*(T,0,1) = x^{T-3}\{(T-3)x - (T-2)\},$$
$$D^*(T,0,2) = (T-3)(T-4)x^{T-5}(x-1)^2,$$
$$D^*(T,1,0) = -x^{T-3}\{x + (T-2)\},$$
$$D^*(T,1,1) = -(T-3)x^{T-5}\{x^2 + (T-5)x - (T-4)\},$$
$$D^*(T,1,2) = -(T-4)x^{T-7}\{(T-3)x^3 + (T-4)(T-7)x^2 - (2T-11)(T-5)x + (T-5)(T-6)\},$$
$$D^*(T,2,0) = (T-3)x^{T-5}\{2x + (T-4)\},$$
$$D^*(T,2,1) = (T-4)x^{T-7}\{2(T-4)x^2 + (T-5)(T-7)x - (T-5)(T-6)\}.$$


The values of D*(7',i,j) for i = 0, 1 agree with those obtained by Hurwicz (1950, p. 381). 
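Proceeding as with model A, we expand the respective integrands in their Maclaurin
series, integrate and multiply by $(1-\alpha^2)^{1/2}$ to obtain

$$E^*(\hat\alpha)/\alpha = Q^*(\beta) = Q^*(0) + Q^{*\prime}(0)\beta + Q^{*\prime\prime}(0)\frac{\beta^2}{2!} + \dots,$$

$$E^*(\hat\alpha^2) = R^*(\beta) = R^*(0) + R^{*\prime}(0)\beta + R^{*\prime\prime}(0)\frac{\beta^2}{2!} + \dots,$$

$$Q^*(0) = \frac{T^2 - 2T + 3}{(T-1)(T+1)} = 1 - \frac{2}{T} + \frac{4}{T^2} - \frac{2}{T^3} + O(T^{-4}),$$

$$Q^{*\prime}(0) = \frac{2(T^2 + 8T - 21)}{(T-1)(T+1)(T+3)(T+5)} = \frac{2}{T^2} + O(T^{-4}),$$

$$Q^{*\prime\prime}(0) = \frac{4(T^4 + 24T^3 + 98T^2 - 264T - 99)}{(T-1)(T+1)(T+3)(T+5)(T+7)(T+9)} = \frac{4}{T^2} + O(T^{-4}),$$

$$E^*(\hat\alpha) = \left(1 - \frac{2}{T} + \frac{4}{T^2} - \frac{2}{T^3}\right)\alpha + \frac{2\alpha^3}{T^2} + \frac{2\alpha^5}{T^2} + \dots. \qquad (11)$$

$Q^*(0)$ and $Q^{*\prime\prime}(0)$ agree with the values obtained by Hurwicz (1950, p. 381, eq. (4.8) and
(4.10)). There apparently is a typographical error in Hurwicz's value for $Q^{*\prime}(0)$ (1950,
eq. (4.9)).

The values associated with $E^*(\hat\alpha^2)$ are

$$R^*(0) = \frac{T^2 - 4T + 7}{(T-3)(T-1)(T+1)} = \frac{1}{T} - \frac{1}{T^2} + \frac{5}{T^3} + O(T^{-4}),$$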
asl = 47 +13 f-1 | =-1_ 
ila laa (a =) (P—1)* @—1) (+1) * (+1) +3) * (+3) 75 *?| 
5 21 73 
= 1- at page t OT), 
RO) = pHa ajo Pe... —— + — 60 = 27 — + a ae 
—4\(P—3) (741) * P—H(P+l)* P+ P+3) P+5)(T+7) | (+7) (149) 
8 40 
= - at OT), 


$$E^*(\hat\alpha^2) = \left(\frac{1}{T} - \frac{1}{T^2} + \frac{5}{T^3}\right) + \left(1 - \frac{5}{T} + \frac{21}{T^2} - \frac{73}{T^3}\right)\alpha^2 + \left(\frac{4}{T^2} - \frac{20}{T^3}\right)\alpha^4 + \dots.$$


VARIANCES 


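From the series expansion given above we may also compute the variances and the
quadratic risks for the two models. We find for model A

$$E(\hat\alpha - \alpha)^2 = E(\hat\alpha^2) - 2\beta E(\hat\alpha)/\alpha + \beta$$

$$= \frac{T^2 - 4T + 7}{(T-3)(T-1)(T+1)} + \frac{-T^3 + 6T^2 + 25T - 42}{(T-1)(T+1)(T+3)(T+5)}\,\beta + \frac{3(T^2 + 16T - 57)}{(T+1)(T+3)(T+5)(T+7)(T+9)}\,\beta^2 + \dots$$

$$= \left(\frac{1}{T} - \frac{1}{T^2} + \frac{5}{T^3}\right) - \left(\frac{1}{T} - \frac{14}{T^2} + \frac{73}{T^3}\right)\alpha^2 + \frac{3}{T^3}\alpha^4 + \dots,$$

$$\sigma_{\hat\alpha}^2 = E(\hat\alpha^2) - E^2(\hat\alpha) = \left(\frac{1}{T} - \frac{1}{T^2} + \frac{5}{T^3}\right) - \left(\frac{1}{T} - \frac{10}{T^2} + \frac{57}{T^3}\right)\alpha^2 + \frac{3}{T^3}\alpha^4 + \dots.$$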

For model B we obtain 




















[247 +7 
B*(8— 0) = oF =3)(P—1) (TF +1) 
1 +1 1 1 oT +1 
2 (eatacn-elcnersn- 7s Reet ranrsa)e 
1 36 12 
+(-g@-y@mrs 1) t (+1) (1+3)(145) (+3) (145) (147) 
15 B 
+oenreprin) = ah 
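$$= \left(\frac{1}{T} - \frac{1}{T^2} + \frac{5}{T^3}\right) - \left(\frac{1}{T} - \frac{13}{T^2} + \frac{69}{T^3}\right)\alpha^2 - \frac{20}{T^3}\alpha^4 + \dots,$$

$$\sigma_{\hat\alpha}^{*2} = E^*(\hat\alpha - E^*(\hat\alpha))^2 = \left(\frac{1}{T} - \frac{1}{T^2} + \frac{5}{T^3}\right) - \left(\frac{1}{T} - \frac{9}{T^2} + \frac{53}{T^3}\right)\alpha^2 - \frac{12}{T^3}\alpha^4 + \dots.$$

Thus we have, up to terms of order $O(T^{-1})$,

$$\sigma_{\hat\alpha}^2 = \sigma_{\hat\alpha}^{*2} = E(\hat\alpha - \alpha)^2 = E^*(\hat\alpha - \alpha)^2 = \frac{1 - \alpha^2}{T}. \qquad (12)$$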


Model A, $x_0 \neq 0$


For model A it is also of interest to examine the moments of $\hat\alpha$ when $x_0 = c \neq 0$. In this
case the joint moment generating function corresponding to (6) is

$$m(u, v) = D(T)^{-1/2}\exp\left\{\frac{c^2}{2}\left(1 - \frac{D(T+1)}{D(T)}\right)\right\}.$$

The first term of the Maclaurin series expansion of $E(\hat\alpha)/\alpha = Q_c(\beta)$ is
0° 1ldm 
Q.(0) - | =n. thee: 


om % OU ~ 


u=0,a=0 





eke? po 
-— e-Meta) fy KTH 4 (‘T — 2 + 02) a-K(T+3), dz. 
1 








Setting $a = \tfrac12(T + 1)$, $y = \tfrac12 c^2$ and integrating by parts we have
Q.(0) = Yer yoAP(1—a,y)—™ (20-4 2y—3)P(L—a,y) + 204-293) }, 


where $\Gamma(1-a, y)$ is the incomplete gamma function

$$\Gamma(1-a, y) = \int_{y}^{\infty} e^{-u}u^{-a}\,du.$$

From Tricomi's asymptotic expansion for $\Gamma(1-a, y)$ (Erdélyi (1953), p. 140, eq. 9.5 (4)) we
have








eon a 2a . 
(1—a,y) = er (14 et yma t OY +a) |, 
_ 1(2a+2y—2 7T 1 2 = 
@.(0) = gee) eee = -reizat (asap): 


“ 2 
E(&@) = (1 - Fiza) BF cece 


We note that for $c = 0$ this agrees with (9). We also note that for large values of $|c| = |x_0|$ we
have

$$\frac{E(\hat\alpha)}{\alpha} \to 1.$$

This supports a conjecture of Hurwicz (1950, p. 373).
The calculations required for the remaining terms of $E(\hat\alpha)/\alpha$ and those for $E(\hat\alpha^2)$ are too
laborious to be attempted at this time.


COMPARISON WITH OTHER RESULTS 
Defining the serial correlation coefficient of order k as 
n—k 
x UU 4% 


t=1 
ap ie ay See 


n—k 
x a 
t=1 


Marriott & Pope (1954) have obtained 


k 
B*(a,) = at — eal +0(T-2). 
For $k = 1$ this reduces to

$$E^*(\hat\alpha) = \alpha\left(1 - \frac{2}{T}\right) + O(T^{-2}),$$

which agrees with (11). We note that (9) also gives

$$E(\hat\alpha) = \alpha\left(1 - \frac{2}{T}\right) + O(T^{-2}).$$


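Bartlett has derived the asymptotic variance of $\hat\alpha_k$ as

$$\operatorname{var}(\hat\alpha_k) = \frac{1}{T}\left[\frac{(1+\alpha^2)(1-\alpha^{2k})}{1-\alpha^2} - 2k\alpha^{2k}\right] + O\left(\frac{1}{T^2}\right).$$

For $k = 1$ this gives

$$\operatorname{var}(\hat\alpha_1) = \frac{1-\alpha^2}{T} + O\left(\frac{1}{T^2}\right).$$

This is equivalent to (12).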


Several authors (Jenkins, 1956; Kendall, 1957; White, 1957a, b) have considered Leipnik's
distribution (Leipnik, 1947) as an approximation to that of $\hat\alpha$.
For Leipnik's distribution we have

$$E(\hat\alpha) = \frac{T}{T+2}\,\alpha = \left(1 - \frac{2}{T} + \frac{4}{T^2} - \frac{8}{T^3}\right)\alpha,$$

$$E(\hat\alpha^2) \cong \left(\frac{1}{T} - \frac{2}{T^2} + \frac{4}{T^3}\right) + \left(1 - \frac{5}{T} + \frac{22}{T^2} - \frac{92}{T^3}\right)\alpha^2,$$

$$E(\hat\alpha - \alpha)^2 = \left(\frac{1}{T} - \frac{2}{T^2} + \frac{4}{T^3}\right) - \left(\frac{1}{T} - \frac{14}{T^2} + \frac{76}{T^3}\right)\alpha^2. \qquad (13)$$

Comparing these results with those obtained for model A we note that up to terms of order
$O(T^{-2})$ the means are the same, while second moments and variances differ only by
$1/T^2 + O(T^{-3})$. The agreement with model B is only to terms of order $O(T^{-1})$.

These results support the use of Leipnik's distribution as an approximation for either
model A or model B for large $T$ (say $T > 20$). This implies that we may also use approximations
derived from Leipnik's distribution such as Quenouille's t-test (Quenouille, 1949;
White, 1957b)

$$t = (\hat\alpha - \alpha)(T+1)^{1/2}(1 - \hat\alpha^2)^{-1/2} \qquad (\text{d.f.} = T + 1)$$

and Jenkins' angular transformation (Jenkins, 1954)

$$z = \sin^{-1}\hat\alpha \sim N(\sin^{-1}\alpha,\ 1/T).$$


FINAL REMARKS 


Only the first three terms of the Maclaurin series for the first two moments of $\hat\alpha$ have been
obtained here. It would be of interest to have available the series for the third and fourth
moments. However, the time required to compute the $j$th item in the expansion of $E(\hat\alpha^k)$ is
roughly proportional to $2^{j+k}$.

It would also be of interest to examine similar expansions when the mean of the process 
is unknown and must be estimated. In this case an appropriate estimator might be 


U(x, —%) (21 — %1) where i:= Lay; . 


a a iT 





We then have (Kendall, 1954; Marriott & Pope, 1954) 


aie hk va 


MS + (1 7) a+0O(T-*) 
which differs considerably from the expected value of $\hat\alpha$.
A third model for serial correlation is the so-called ‘circular’ case. Here the estimator is 


defined as : 
Oe 
c= Sar (Xp = Xp). 


Leipnik's distribution was derived as an approximation to the distribution of $\hat\alpha_c$, and thus it
is reasonable to assume the series expansions for the moments of $\hat\alpha_c$ will be similar to those
of (13) and hence similar to those for $\hat\alpha$.








On the other hand Kendall (1954) has shown that when the mean of 2, is unknown 


na) =—3+(1-A)2+0(3). 
Comparing this with H(%) we see that the two expectations differ by 1/7’. 

We might also consider the so-called 'explosive' case of model A, i.e. $|\alpha| > 1$. In this case
it has been shown that $\hat\alpha$ has an asymptotic Cauchy distribution (White, 1958, 1959), and
hence has no asymptotic moments. It seems likely that the formal series expansions given
above have a radius of convergence of 1 and hence are only valid for $|\alpha| < 1$.


REFERENCES 


BARTLETT, M. S. (1946). On the theoretical specification and sampling properties of auto-correlated
time series. J. R. Statist. Soc. Suppl. 8, 27.

ERDÉLYI, A. (Ed.) (1953). Higher Transcendental Functions, 2. New York: McGraw-Hill Inc.

HURWICZ, L. (1950). Least-squares bias in time series. Statistical Inference in Dynamic Economic
Models, pp. 265-84 (editor T. C. Koopmans). New York: John Wiley and Sons Inc.

JENKINS, G. M. (1954). An angular transformation for the serial correlation coefficient. Biometrika,
41, 261-5.

JENKINS, G. M. (1956). Tests of hypotheses in the linear autoregressive model. II. Biometrika, 43,
186-99.

KENDALL, M. G. (1954). Note on the bias in the estimation of autocorrelation. Biometrika, 41,
403-4.

KENDALL, M. G. (1957). The moments of the Leipnik distribution. Biometrika, 44, 270-2.

KOOPMANS, T. (1942). Serial correlation and quadratic forms in normal variables. Ann. Math.
Statist. 13, 14-23.

LEIPNIK, R. B. (1947). Distribution of the serial correlation coefficient in a circularly correlated
universe. Ann. Math. Statist. 18, 80-7.

MARRIOTT, F. H. C. & POPE, J. A. (1954). Bias in the estimation of autocorrelation. Biometrika,
41, 390-6.

QUENOUILLE, M. H. (1949). Approximate tests of correlation in time-series 3. Proc. Camb. Phil. Soc.
45, 483-4.

WHITE, J. S. (1957a). Approximate moments for the serial correlation coefficient. Ann. Math.
Statist. 28, 798-802.

WHITE, J. S. (1957b). A t-test for the serial correlation coefficient. Ann. Math. Statist. 27, 1046-8.

WHITE, J. S. (1958). The limiting distribution of the serial correlation coefficient in the explosive case.
Ann. Math. Statist. 29, 1188-97.

WHITE, J. S. (1959). The limiting distribution of the serial correlation coefficient in the explosive
case, II. Ann. Math. Statist. 30, 831-4.

WILLIAMS, J. D. (1941). Moments of the ratio of the mean square successive difference in samples
from a normal universe. Ann. Math. Statist. 12, 239-41.











Biometrika (1961), 48, 1 and 2, p. 95 
Printed in Great Britain 


Significance tests for paired-comparison experiments} 


By T. H. STARKS AND H. A. DAVID
E. I. du Pont de Nemours and Company, Inc. and Virginia Polytechnic Institute 


1. INTRODUCTION 


The method of paired comparisons was introduced by Thurstone (1927) as a way of 
assessing the relative strengths of several ‘stimuli’ when no meaningful absolute measure- 
ments are possible. An important situation to which the method may be applied is that of 
preference testing, in which the aim is to arrange on some sort of scale the preferences 
expressed by a panel of judges for a number of treatments subjectively compared two at 
a time. Although many papers dealing with the analysis of paired-comparison experiments 
have appeared since 1927 little has been written about tests of significance involving in- 
dividual treatments or subgroups of treatments. One exception is a paper by Scheffé 
(1952) who used Tukey’s test based on allowances to separate treatments into significantly 
different groups, but required a scoring of the individual comparison decisions and certain 
normality assumptions. 

For simple (i.e. unrepeated) paired-comparison experiments David (1959) gave significance 
tests, resting on a minimum of assumptions, for a particular treatment, for the difference 
between two particular treatments, and for the treatment most often preferred. In the 
present paper these tests are extended to repeated experiments. Also three multiple com- 
parison procedures for separating significantly different treatment scores are introduced. 
A numerical example illustrates the use of each of the various methods. 


2. NOTATION AND ASSUMPTIONS 


We shall consider balanced paired-comparison experiments with $t$ treatments and $n$
repetitions. The score of treatment $i$ ($i = 1, \dots, t$), i.e. the number of times treatment $i$ is
preferred in the experiment, is denoted by $a_i$. The symbol $\pi_{ij}$ represents the probability
that treatment $i$ is preferred to treatment $j$ in a paired comparison. Ties are not permitted
so that

$$\pi_{ij} = 1 - \pi_{ji} \qquad (i \neq j;\ i, j = 1, \dots, t). \qquad (1)$$

It is assumed that the $\pi_{ij}$ remain constant throughout the experiment. Thurstone (1927)
and Bradley & Terry (1952) propose two models which allow the $\pi_{ij}$ to be expressed in
terms of $t-1$ independent parameters. These are examples of models sometimes called
'linear', since estimates of the treatment effects may usefully be represented by points on
a line. The tests developed in the sequel are not tied to a particular linear model, but
provide exact or approximate tests under any linear model. Indeed, these tests may still
be used when no linear model is appropriate, a situation which arises, for example, when
$\pi_{12} > \tfrac12$, $\pi_{23} > \tfrac12$ but $\pi_{13} < \tfrac12$.

We shall represent the average preference probability of treatment $u$ ($u = 1, \dots, t$) by

$$\pi_{u.} = \sum{}' \pi_{ui}/(t-1), \qquad (2)$$

where $\sum'$ indicates summation over all treatments except $u$.
† This work was sponsored in part by the Office of Ordnance Research, U.S. Army.


96 T. H. Starks anp H. A. Davip 


The significance test methods in this paper are developed under the null hypothesis
$H_0'$ (say) that

$$\pi_{ij} = \tfrac12 \quad \text{for all } i, j. \qquad (3)$$

The tests developed under $H_0'$ are shown to be valid, but occasionally conservative (i.e. the
true probability of rejection of the null hypothesis when it is true is less than the stated
test significance level), when testing certain more general null hypotheses for which $H_0'$ is
a special case.


3. TEST OF ONE PARTICULAR TREATMENT 


Although the principal reason for the paired-comparison experiment may be to obtain
a ranking of the treatments, the experimenter may be especially interested in whether a
particular treatment $u$ is better than the average; that is, he wishes to test

$$H_0: \pi_{u.} = \tfrac12 \qquad (4)$$
against
$$H_1: \pi_{u.} > \tfrac12. \qquad (5)$$
Under
$$H_0': \pi_{ij} = \tfrac12 \quad \text{for all } i, j, \qquad (6)$$

the score $a_u$ of treatment $u$ is a binomial variate with parameters $[\tfrac12, n(t-1)]$, and the usual
binomial test can be applied to test $H_0'$ versus $H_1$. If

$$\pi_{u.} = \tfrac12, \qquad (7)$$

but not all the $\pi_{ui}$'s are equal, $a_u$ is a generalized binomial variate (Cramér, 1946, § 16.6)
with the same expected value as under $H_0'$, but with a variance that is less than it was under
$H_0'$ by the amount

$$n\sum{}'(\pi_{ui} - \tfrac12)^2. \qquad (8)$$

It can be shown (cf. David, 1960) that the binomial test of $H_0$ versus $H_1$ will, if it is not
exact, be a conservative test procedure.
The binomial property, of course, also can be used to test the alternative hypotheses

$$H_{1,2}: \pi_{u.} < \tfrac12 \quad\text{and}\quad H_{1,3}: \pi_{u.} \neq \tfrac12.$$


4, TEST OF EQUALITY OF TWO PRE-ASSIGNED TREATMENTS 


Consider the case in which interest is expressed, prior to the performance of the $t$-treatment
experiment, concerning the existence of a distinguishable difference between treatments
$u$ and $v$. After the experiment, we may test

$$H_0: \pi_{u.} = \pi_{v.}$$
against
$$H_1: \pi_{u.} \neq \pi_{v.}. \qquad (9)$$

The test procedure is again developed under

$$H_0': \pi_{ij} = \tfrac12 \quad\text{for all } (i, j).$$



















Significance tests for paired-comparison experiments 97 
For any positive integer $m$, let

$$P_{u,v,m} = \Pr(|a_u - a_v| \ge m \mid H_0') = 2\Pr(a_u - a_v \ge m \mid H_0')$$

$$= 2\sum_{d=m}^{n(t-1)}\sum_{p=0}^{n}\ [\Pr(\text{In all comparisons between treatments } u \text{ and } v,\ u \text{ is preferred } k = (2p-n) \text{ more times than is } v)$$
$$\times \Pr(\text{treatment } u \text{ is preferred } (d-k) \text{ more times than is } v \text{ in comparisons with the other } (t-2) \text{ treatments})]$$

$$= 2\sum_{d=m}^{n(t-1)}\sum_{p=0}^{n}\binom{n}{p}2^{-n}\sum_{q=0}^{n(t-2)}\binom{n(t-2)}{q}\binom{n(t-2)}{q-d+2p-n}2^{-2n(t-2)}$$

$$= 2^{3n-2nt+1}\sum_{d=m}^{n(t-1)}\sum_{p=0}^{n}\binom{n}{p}\sum_{q=0}^{n(t-2)}\binom{n(t-2)}{q}\binom{n(t-2)}{q-d+2p-n}. \qquad (10)$$

The equality of the coefficients of $x^{-d+2p-n}$ in the expansion of the two sides of the identity

$$(1+x)^{n(t-2)}(1+x^{-1})^{n(t-2)} = (1+x)^{2n(t-2)}x^{-n(t-2)}, \qquad (11)$$

gives

$$\sum_{q=0}^{n(t-2)}\binom{n(t-2)}{q}\binom{n(t-2)}{q-d+2p-n} = \binom{2n(t-2)}{n(t-3)-d+2p}. \qquad (12)$$

Substituting (12) into (10), we obtain

$$P_{u,v,m} = 2^{3n-2nt+1}\sum_{d=m}^{n(t-1)}\sum_{p=0}^{n}\binom{n}{p}\binom{2n(t-2)}{n(t-3)-d+2p}. \qquad (13)$$

Hence to test $H_0'$ against $H_1$, we follow the


Test procedure


(1) Choose the desired significance level $\alpha$.
(2) Find $m_\alpha$, the smallest integer value of $m$ for which $P_{u,v,m}$ does not exceed $\alpha$.
(3) Accept $H_1$ if $|a_u - a_v|$ is not less than $m_\alpha$.


To test $H_0'$ against the one-sided alternative hypothesis
$$H_{1,2}: \pi_{u.} > \pi_{v.},$$
use the above test procedure with $\tfrac12 P_{u,v,m}$ in place of $P_{u,v,m}$ in Step 2 and remove the absolute
value signs in Step 3. The critical values of $m$ for one- and two-sided tests when $\alpha = 0.01$
or $0.05$ are given in Table 1.
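
The double sum for the tail probability of the score difference is easy to evaluate exactly. The sketch below is an illustration under the null hypothesis that every preference probability equals one half; the function names `binom`, `p_two_sided` and `critical_value` are hypothetical. It evaluates $2^{3n-2nt+1}\sum_{d}\sum_{p}\binom{n}{p}\binom{2n(t-2)}{n(t-3)-d+2p}$ with exact fractions and searches for the smallest $m$ meeting a given level.

```python
from fractions import Fraction
from math import comb


def binom(n, k):
    """Binomial coefficient, taken as zero outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0


def p_two_sided(n, t, m):
    """Pr(|a_u - a_v| >= m) when every preference probability is 1/2."""
    total = sum(binom(n, p) * binom(2 * n * (t - 2), n * (t - 3) - d + 2 * p)
                for d in range(m, n * (t - 1) + 1)
                for p in range(n + 1))
    # 2^(3n - 2nt + 1) * total, written as an exact fraction
    return Fraction(2 * total, 2 ** (2 * n * t - 3 * n))


def critical_value(n, t, alpha, one_sided=False):
    """Smallest m whose tail probability does not exceed alpha, else None."""
    for m in range(1, n * (t - 1) + 1):
        prob = p_two_sided(n, t, m)
        if one_sided:
            prob = prob / 2
        if prob <= alpha:
            return m
    return None


print(float(p_two_sided(3, 4, 6) / 2))   # one-sided tail for n = 3, t = 4, m = 6
print(float(p_two_sided(4, 3, 6) / 2))   # one-sided tail for n = 4, t = 3, m = 6
print(critical_value(1, 12, Fraction(1, 100)))
```

The two printed one-sided tails are the cases in which the probability only just exceeds 0.01, so a strict search such as `critical_value` would move on to the next integer there.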

As was pointed out by David (1959) and as observed in Table 1, the distribution of the
difference

$$m = a_u - a_v$$

tends rapidly toward the normal $(0, \tfrac12 nt)$ distribution as either $n$ or $t$, or both, become large.
The convergence follows from the fact that the characteristic function of $m/\sqrt{(\tfrac12 nt)}$ is

$$\phi_{m/\sqrt{(\frac12 nt)}}(\omega) = [\cos \tfrac12\omega\sqrt{(2/nt)}]^{2n(t-2)}\,[\cos \omega\sqrt{(2/nt)}]^{n}, \qquad (14)$$

which tends to $e^{-\frac12\omega^2}$ as $n$ or $t$ become large.
For the same reasons as mentioned in the preceding section, this test becomes conservative
when $H_0$ is true, but differs from $H_0'$.
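
As a check on the limiting variance, the exact null distribution of the difference can be tabulated and its variance compared with $\tfrac12 nt$. This is a sketch under the assumption that all preference probabilities equal one half; `diff_pmf` is a hypothetical helper written for this illustration.

```python
from fractions import Fraction
from math import comb


def diff_pmf(n, t):
    """Exact null distribution of a_u - a_v as a dict {d: probability}."""
    big_n = n * (t - 2)                 # comparisons of u (or v) with the others
    denom = 2 ** (n + 2 * big_n)
    pmf = {}
    for d in range(-n * (t - 1), n * (t - 1) + 1):
        weight = 0
        for p in range(n + 1):
            k = big_n - n - d + 2 * p   # the index n(t-3) - d + 2p
            if 0 <= k <= 2 * big_n:
                weight += comb(n, p) * comb(2 * big_n, k)
        pmf[d] = Fraction(weight, denom)
    return pmf


pmf = diff_pmf(2, 4)
mean = sum(d * pr for d, pr in pmf.items())
variance = sum(d * d * pr for d, pr in pmf.items()) - mean ** 2
print(mean, variance)
```

With $n = 2$ and $t = 4$ the variance should equal $\tfrac12 nt = 4$ exactly, since the difference is a sum of independent symmetric terms.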







98 


Experiment a= 0-01 
» 

—o FF One-sided Two-sided 
n i test m;, test m, 
1 <4 No significant values 
1 5 4 None possible 
1 6 5 5 
1 7 5 5 
1 8 5 6 
1 9 6 6 
1 10 6 7 
1 ll 6 7 
1 12 Ys 7 
1 13 7 a 
1 14 7 8 
1 15 ¥ 8 
1 16 Z| 8 
2 3 No significant values 
2 4 5 6 
2 5 6 6 
3 3 6 6 
3 4 67 7 
4 3 6t 7 
4 4 7 8 


All larger values of 


nort 


After running a paired-comparison experiment, the experimenter often wishes to know
whether the treatment with the highest score is distinguishably better than the average
of the $t$ treatments. If the treatment with the largest score is labelled as treatment (1), he







+ Pas = 0-0103. 


5. A TEST OF THE HIGHEST SCORE 


wishes to test the null hypothesis 


against the alternative hypothesis 


score, $a_{(1)}$, under $H_0'$.


$$H_0: \pi_{(1).} = \tfrac12$$
$$H_1: \pi_{(1).} > \tfrac12. \qquad (15)$$


[o = (4nt)}] 


m, = smallest m, = smallest 
integer integer 


>2:3380+0:5 232-560+0-5 


Table 1. Critical values for the difference between scores of 
two pre-assigned treatments 





a = 0-05 
_ ies eo | 
One-sided Two-sided 
test m;, test m, 


No significant values 


~~ rh rh 


ana ao oO > 


ok; > 


or 


6 


m’, = smallest m, = smallest 


integer 


>1:640+0:5 231-960+0-5 


t Prog = 0-01001. 


Again we develop the test under $H_0'$: $\pi_{ij} = \tfrac12$ for all $(i, j)$. To perform a test of $H_0'$ against $H_1$,
one needs information concerning at least the critical part of the distribution of the largest


6 
6 


integer 












If we let $A_i$ be the event $a_i \ge M$ $[0 \le M \le n(t-1)]$, then the elementary law of probability
concerning the sum of events, plus the equi-probability of the events $A_i$ under
$H_0'$, yields

$$\Pr(a_{(1)} \ge M) = \Pr\left(\sum_{i=1}^{t} A_i\right) = \sum_{j=1}^{t}(-1)^{j+1}\binom{t}{j}\Pr(A_1 A_2 \dots A_j). \qquad (16)$$

Aside from the trivial case $M = 0$, there exists one other set of values of $M$, namely,
$M > n(t-1) - \tfrac12 n$, for which it is easy to evaluate $\Pr(a_{(1)} \ge M)$. When $M > n(t-1) - \tfrac12 n$,

$$\Pr(a_{(1)} \ge M \mid H_0') = \binom{t}{1}\Pr(A_1) = t \cdot 2^{-n(t-1)}\sum_{k=M}^{n(t-1)}\binom{n(t-1)}{k}, \qquad (17)$$

since no two treatment scores can simultaneously exceed $[n(t-1) - \tfrac12 n]$.

When
$$0 < M \le n(t-1) - \tfrac12 n, \qquad (18)$$
it is difficult to determine the joint probabilities on the right-hand side of (16) since the
scores are correlated binomial variates. However, we can use a simple approximation.
For the values of $M$ such that $\Pr(A_i) < 1/t$, one can apply the Bonferroni inequality to
obtain

$$t \cdot \Pr(A_1) - \binom{t}{2}\Pr(A_1 A_2) \le \Pr(a_{(1)} \ge M) \le t \cdot \Pr(A_1). \qquad (19)$$

Since the sum of the treatment scores is a constant, we have
$$\Pr(A_i \mid A_j) < \Pr(A_i), \qquad (20)$$
and therefore,
$$\Pr(A_i A_j) < [\Pr(A_1)]^2. \qquad (21)$$
It follows that
$$t \cdot \Pr(A_1) - \binom{t}{2}[\Pr(A_1)]^2 < \Pr(a_{(1)} \ge M) \le t \cdot \Pr(A_1). \qquad (22)$$
Since
$$\Pr(A_1) = 2^{-n(t-1)}\sum_{k=M}^{n(t-1)}\binom{n(t-1)}{k} \qquad (23)$$

can be calculated directly or found in a table of the binomial probability distributions,
we have limits on $\Pr(a_{(1)} \ge M \mid H_0')$ that are easy to obtain.

To make an approximate $\alpha$-level significance test of $H_0'$ against $H_1$, one should choose
as the critical value that positive integer value of $M$, say $M_\alpha$, for which

$$t \cdot \Pr(a_i \ge M_\alpha \mid H_0') = \beta \le \alpha < t \cdot \Pr(a_i \ge M_\alpha - 1 \mid H_0'). \qquad (24)$$

When $\beta = 0.05$ and $t$ is large, we find from relations (22) and (24) that

$$0.04875 < \Pr(a_{(1)} \ge M_{0.05} \mid H_0') \le 0.05. \qquad (25)$$

The range between the limits decreases as $t$ or $\beta$, or both, decrease. Hence the true significance
level of the above approximate test of $H_0'$ versus $H_1$ is known to be between quite
narrow bounds.
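
The bounds on the significance level of the highest-score test can be computed directly from binomial tails. The following sketch assumes the null hypothesis that all preference probabilities are one half; `upper_tail`, `critical_score` and `bonferroni_bounds` are hypothetical names chosen for this illustration.

```python
from fractions import Fraction
from math import comb


def upper_tail(n, t, M):
    """Pr(a_1 >= M) for a binomial score with n(t-1) trials and p = 1/2."""
    trials = n * (t - 1)
    count = sum(comb(trials, k) for k in range(max(M, 0), trials + 1))
    return Fraction(count, 2 ** trials)


def critical_score(n, t, alpha):
    """Smallest M with t * Pr(a_1 >= M) <= alpha, or None if no M qualifies."""
    for M in range(n * (t - 1) + 1):
        if t * upper_tail(n, t, M) <= alpha:
            return M
    return None


def bonferroni_bounds(n, t, M):
    """Lower and upper limits on Pr(largest score >= M)."""
    p = upper_tail(n, t, M)
    return t * p - comb(t, 2) * p * p, t * p


M_alpha = critical_score(2, 5, Fraction(5, 100))
lower, upper = bonferroni_bounds(2, 5, M_alpha)
print(M_alpha, float(lower), float(upper))
```

Because the search returns the smallest qualifying $M$, the value at $M - 1$ automatically exceeds $\alpha$, which is the two-sided condition used to define the critical value.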

If more accuracy is necessary in testing $H_0'$ against $H_1$, one may consider the general
form of the joint density function of $s$ ($s \le t$) scores


i,j week = 
S(Gj, Gj, ---5@,) = ghns2t—-s—) 5 i ( t—s ), (26) 


Pp ao a, 
where $a_p'$ is the score of treatment $p$ in a subexperiment between the $s$ treatments, and the
sum is over all outcomes of subexperiments compatible with the final score. By using (26)
to find the joint probabilities on the right-hand side of (16), one obtains the exact value
of $\Pr(a_{(1)} \ge M \mid H_0')$. It should be observed that this method will become increasingly difficult
as $n$ and $t$ become large.

Again because the $a_i$ change from binomial to generalized binomial variates when we
change from $H_0'$ to $H_0$, we have that the following test of $H_0$ versus $H_1$ is a conservative test
when $H_0$ is true but $H_0'$ is not.

Test procedure 


(1) Choose the desired significance level $\alpha$.
(2) From a table of the cumulative binomial probability distribution, find the integer
$M_\alpha$ such that

$$t \cdot \Pr(a_i \ge M_\alpha \mid H_0') = \beta \le \alpha < t \cdot \Pr(a_i \ge [M_\alpha - 1] \mid H_0').$$

(3) If the largest score in the experiment is not less than $M_\alpha$, accept the hypothesis that
the treatment corresponding to that score is better than average.


A similar argument would give the approximate significance test of whether the treat- 
ment with the smallest score is below the average of the ¢ treatments. 


6. MULTIPLE COMPARISON TESTS OF TREATMENT SCORES 


To separate significantly different treatment scores, two methods are introduced. One 
is a multiple comparison range test analogous to Tukey’s test based on allowances (see 
Federer, 1955, p. 29) for separating significantly different independent normally distributed 
sample means. The other method is analogous to Fisher’s least significant difference method. 


6-1. Multiple comparison range test 
The multiple range method for a paired-comparison experiment is as follows: 
(1) Choose a significance level $\alpha$.
(2) Find a positive integer $R_{(\alpha)}$, say, such that

$$\Pr[a_{(1)} - a_{(t)} \ge R_{(\alpha)}] = \beta \le \alpha < \Pr[a_{(1)} - a_{(t)} \ge R_{(\alpha)} - 1], \qquad (27)$$

where $a_{(i)}$ denotes the $i$th largest score and the probabilities are calculated under the
hypothesis

$$H_0': \pi_{ij} = \tfrac12 \qquad (i \neq j;\ i, j = 1, \dots, t).$$

(3) Any pairwise difference in scores not less than $R_{(\alpha)}$ is considered significant.


This test is based on the argument that if any two scores differ by more than we could
reasonably expect the two extreme scores to differ by under $H_0'$, we should declare the pair
significantly different. This test is conservative when the model is not linear (i.e. when
$H_0$: $\pi_{i.} = \tfrac12$, for all $i$, is not equivalent to $H_0'$).

Step 2 of the above test procedure requires a knowledge of the distribution of the range
$a_{(1)} - a_{(t)}$ of the $t$ scores under $H_0'$. There are $t(t-1)$ differences of the form $(a_i - a_j)$; we are
interested in the distribution of the maximum of the $t(t-1)$ differences. Equation (16)
expresses the probability that the maximum of $t$ scores will not be less than a certain quantity
$M$ as the sum of $k$-fold ($k = 1, \dots, t$) joint probabilities. If we let $R_{ij}$ be the event
$(a_i - a_j) \ge R$, where $R$ is a positive integer, then the substitution of $t(t-1)$ and the $R_{ij}$'s
in the right-hand side of (16) for $t$ and the $A_i$'s, respectively, gives an expression for
$\Pr(a_{(1)} - a_{(t)} \ge R)$. When

$$R > n(t-1) - \tfrac12 n, \qquad (28)$$





the $R_{ij}$'s are mutually exclusive events; and, since every $R_{ij}$ is equally likely under $H_0'$,

$$\Pr(a_{(1)} - a_{(t)} \ge R \mid H_0',\ R > n(t-1) - \tfrac12 n) = t(t-1)\Pr(R_{12}) \qquad (29)$$

$$= t(t-1)\,2^{3n-2nt}\sum_{d=R}^{n(t-1)}\sum_{p=0}^{n}\binom{n}{p}\binom{2n(t-2)}{n(t-3)-d+2p}. \qquad (30)$$

However, to find $\Pr(a_{(1)} - a_{(t)} \ge R)$ when
$$0 < R \le n(t-1) - \tfrac12 n, \qquad (31)$$
it is necessary to evaluate joint probabilities of the form
$$\Pr(R_{ij} \dots R_{kl}). \qquad (32)$$


The nature of the R,,;’s makes the evaluation of (32) difficult. For this reason, an approxima- 
tion to the distribution of (a) —a) is obtained that is reasonably accurate in the critical 
part of the distribution (i.e. 0-005 < « < 0-075). If we consider 


$$d_i = 2[a_i - \tfrac12 n(t-1)](nt)^{-1/2} \quad (i = 1, \ldots, t), \qquad (33)$$

then we have that under $H_0$

$$E(d_i) = 0, \qquad (34)$$

$$E(d_i^2) = (t-1)/t, \qquad (35)$$

$$E(d_i d_j) = -1/t \quad (i \ne j). \qquad (36)$$


Also, since $a_i$ is a binomial variate, $d_i$ is asymptotically normal. Therefore we may expect
the distribution of the range (cf. Hartley, 1950)

$$d_{(1)} - d_{(t)} = 2(a_{(1)} - a_{(t)})/\sqrt{nt} \qquad (37)$$

to be asymptotically the same as that for $t$ independent observations from a normal population
with variance

$$\sigma^2(1 - \rho) = (t-1)/t + 1/t = 1. \qquad (38)$$

The probability integral of the range $W_t$ (say) of $t$ independent observations ($t = 2\,(1)\,20$)
from a normal population with variance one is given in Table 23 in the Biometrika Tables
for Statisticians, Vol. 1, by Pearson & Hartley (1954). Hence $\Pr(W_t \ge 2R/\sqrt{nt})$ is an
easily obtained approximation to $\Pr(a_{(1)} - a_{(t)} \ge R)$ and will improve in accuracy as
$n(t-1)$ becomes large.

The range $W_t$ is a continuous variate, whereas the range $(d_{(1)} - d_{(t)})$ is a discrete
variate with distinct values differing by not less than $2(nt)^{-1/2}$. This suggests that a
continuity correction might improve the approximation. The usual correction would subtract
$(nt)^{-1/2}$ from the observed range of the $d_i$'s, $2R(nt)^{-1/2}$. However, it has been
empirically found that more accurate results are obtained in the interval of significance levels
(0.005, 0.075) through the subtraction of $\tfrac12(nt)^{-1/2}$ from $2R(nt)^{-1/2}$.
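The corrected approximation $\Pr(W_t \ge (2R - \tfrac12)(nt)^{-1/2})$ can be sketched without tables. The code below is an illustration only: it assumes the standard integral for the distribution of the range of $t$ independent unit normal variates, $\Pr(W_t \le w) = t\int \phi(x)[\Phi(x+w) - \Phi(x)]^{t-1}\,dx$, and evaluates it by Simpson's rule in place of the tabulated probability integral. The function names and the integration settings are hypothetical choices.

```python
from math import erf, exp, pi, sqrt

def phi(x):
    """Standard normal density."""
    return exp(-0.5 * x * x) / sqrt(2 * pi)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def range_upper_tail(w, t, steps=4000, lim=8.0):
    """Pr(W_t >= w) for the range of t independent N(0, 1) variates."""
    if w <= 0:
        return 1.0
    h = 2 * lim / steps  # steps must be even for Simpson's rule
    total = 0.0
    for k in range(steps + 1):
        x = -lim + k * h
        f = phi(x) * (Phi(x + w) - Phi(x)) ** (t - 1)
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * f
    return 1 - t * total * h / 3

def approx_range_tail(R, t, n):
    """Continuity-corrected approximation to Pr(a_(1) - a_(t) >= R | H0)."""
    return range_upper_tail((2 * R - 0.5) / sqrt(n * t), t)

print(approx_range_tail(24, 5, 30))
```

With $t = 5$, $n = 30$, $R = 24$ the argument of $W_5$ is about 3.878, the case worked in the carbon paper example of § 8.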

For small experiment sizes, $\Pr(a_{(1)} - a_{(t)} \ge R \mid H_0)$ can be obtained from the
tables, giving the probabilities of all possible outcomes, found in Bradley & Terry (1952),
Bradley (1954), and David (1959). In Table 2, the approximation method given above is compared
with values obtained from the tables for the largest experiments given in the tables. Owing to
the asymptotic nature of the distribution of $a_{(1)} - a_{(t)}$, we may expect the approximation
to be better for larger experiments.

To summarize, we shall restate the steps in this multiple range test of treatment scores. 





102 T. H. Starks and H. A. David


Table 2. Comparison of $\Pr(a_{(1)} - a_{(t)} \ge R)$ with its approximation $\Pr(W_t \ge (2R - \tfrac12)(nt)^{-1/2})$

 t    n    R    Pr(a_(1) - a_(t) >= R | H_0)†    Pr(W_t >= (2R - 1/2)(nt)^(-1/2))
 3   10   12    0.0062 ± 0.00003                 0.0078
 3   10   11    0.0157                           0.0152
 3   10   10    0.0340 ± 0.00003                 0.0317
 4    6   11    0.0094                           0.0104
 4    6   10    0.0258                           0.0252
 4    6    9    0.0616                           0.0560
 4    7   12    0.0085 ± 0.00008                 0.0092
 4    7   11    0.0217 ± 0.00006                 0.0212
 4    7   10    0.0493 ± 0.00005                 0.0454
 4    8   13    0.0075 ± 0.00009                 0.0078
 4    8   12    0.0178 ± 0.00008                 0.0174
 4    8   11    0.0389 ± 0.00006                 0.0363
 5    5   12    0.0071 ± 0.00015                 0.0079
 5    5   11    0.0199 ± 0.00013                 0.0200
 5    5   10    0.0494 ± 0.00011                 0.0460
 8    1    7    0.0068                           0.0169
 8    1    6    0.0738                           0.0778

† The values of $\Pr(a_{(1)} - a_{(t)} \ge R \mid H_0)$ were calculated from the tables of Bradley
& Terry (1952), Bradley (1954) and David (1959). Because of the construction of the
Bradley-Terry and Bradley tables, it was sometimes necessary to subtract probabilities of smaller
ranges from the cumulative probability entered in these tables. Each subtraction could cause
errors of as much as ±0.00005. The entries in such cases are of the form $x \pm \sigma$, where
$\sigma = 0.00005\sqrt{k/3}$, and $k$ represents the number of subtractions.


Test procedure 


(1) Choose the desired significance level $\alpha$.
(2) Find a positive integer $R_{(\alpha)}$, say, such that

$$\Pr[a_{(1)} - a_{(t)} \ge R_{(\alpha)}] = \beta \le \alpha < \Pr[a_{(1)} - a_{(t)} \ge R_{(\alpha)} - 1], \qquad (39)$$

where the probabilities are calculated under hypothesis $H_0$.

(a) If the experiment is small, use Table 3 to obtain $R_{(\alpha)}$.

(b) If the experiment is too large for method (a), find the upper $100\alpha$% point,
$W_{t,\alpha}$ say, of the $W_t$ distribution (Biometrika Tables, Table 22) and solve

$$W_{t,\alpha} = (2R^* - \tfrac12)(nt)^{-1/2} \qquad (40)$$

for $R^*$. If $R^+$, the smallest integer not less than $R^*$, is larger than
$n(t-1) - \tfrac12 n$, use equation (27) to obtain $\beta$ and $R_{(\alpha)}$; otherwise, set

$$R_{(\alpha)} = R^+. \qquad (41)$$

(3) Any pairwise difference in scores not less than $R_{(\alpha)}$ will be considered a
significant difference.
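Step (2)(b) amounts to a short calculation once the tabulated percentage point is in hand. The sketch below solves (40) for $R^*$, rounds up to $R^+$, and reports whether $R^+$ exceeds $n(t-1) - \tfrac12 n$, in which case the exact route applies. The function name is hypothetical, and the percentage point is supplied by the user from the tables rather than computed.

```python
from math import ceil, sqrt

def range_critical_value(w_alpha, t, n):
    """Solve (40) for R*, round up to R+, and flag the mutually exclusive case."""
    r_star = 0.5 * w_alpha * sqrt(n * t) + 0.25
    r_plus = ceil(r_star)
    exact_needed = r_plus > n * (t - 1) - n / 2
    return r_star, r_plus, exact_needed

# Carbon paper example of section 8: t = 5 brands, n = 30 repetitions,
# tabulated upper 5% point of W_5 taken as 3.86.
r_star, r_plus, exact_needed = range_critical_value(3.86, t=5, n=30)
print(r_star, r_plus, exact_needed)
```

For these inputs $R^*$ is about 23.89 and $R^+ = 24$, which is below $n(t-1) - \tfrac12 n = 105$, so (41) applies.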


This method is easily extended to large balanced subjective experiments with more
than two treatments per block by using a generalized form of the $d_i$-statistic in the previous
argument. Let the blocks be of size $k$ ($2 < k < t$) and the treatments in each block be scored
from 1 to $k$; then the generalized form of $d_i$, say $d_i^*$, is

$$d_i^* = 2\sqrt{3}\,(a_i - \bar a)/[\lambda(k+1)t]^{1/2}, \qquad (42)$$

where

$$\bar a = \frac{1}{t}\sum_{i=1}^{t} a_i, \qquad (43)$$

and $\lambda$ is the number of blocks in which any particular pair occurs. The critical value of
the test is $R^+$.


Table 3. Critical values $R_{(\alpha)}$ for the multiple comparison range test†

             α = 0.01              α = 0.05
 t    n    R_(α)    β           R_(α)    β
 3    1    None     None        None     None
 3    2    None     None        None     None
 3    3    6        0.01        None     None
 3    4    7        0.01        6        0.05
 3    5    8        0.01        7        0.04
 3    6    9        0.01        8        0.03
 3    7    10       0.01        8        0.05
 3    8    10       0.01        9        0.03
 3    9    11       0.01        9        0.05
 3   10    12       0.01        10       0.03
 4    1    None     None        None     None
 4    2    6        0.01        None     None
 4    3    8        0.005       7        0.03
 4    4    9        0.01        8        0.03
 4    5    10       0.01        9        0.03
 4    6    11       0.01        9        0.06
 4    7    12       0.01        10       0.05
 4    8    13       0.01        11       0.04
 5    1    None     None        None     None
 5    2    7        0.0154      6        0.08
 5    3    9        0.01        8        0.04
 5    4    11       0.005       9        0.05
 5    5    12       0.01        10       0.05
 6    1    None     None        5        0.0586
 7    1    None     None        6        0.0205
 8    1    7        0.0068      6        0.0738

† These critical values $R_{(\alpha)}$ have been chosen in such a way as to make $\beta$ as close
to $\alpha$ as possible rather than requiring that $R_{(\alpha)}$ be such that $\beta \le \alpha$
as was done in the defining equation (27).


6.2. Least significant difference method


A second and less conservative method for comparing treatment scores is one that is
analogous to Fisher's least significant difference method of comparing sample means in
the analysis of variance. This method consists of first testing

$$H_0: \pi_{i.} = \tfrac12 \text{ for all } i,$$

against

$$H_1: \pi_{i.} \ne \tfrac12 \text{ for some } i.$$

(This is analogous to the F-test in the analysis of variance.) For small experiments, the
previously mentioned tables of Bradley & Terry, Bradley, and David may be used to test
$H_0$ against $H_1$. For any experiments outside the range of these tables, the approximate
$\chi^2$-test suggested by Durbin (1951) has been found to be reasonably accurate (cf. Starks,
1958). In terms of the notation used in this paper, the test consists of comparing

$$D = \sum_{i=1}^{t} d_i^2 = 4\Big[\sum_{i=1}^{t}(a_i - \bar a)^2\Big]\Big/(nt) \qquad (44)$$

with the upper $100\alpha$% point of the $\chi^2$-distribution with $(t-1)$ degrees of freedom.

If one fails to reject $H_0$, no significant difference between treatment scores is declared.
If $H_0$ is rejected, we may apply the 'test of equality of two pre-assigned treatments',
described in § 4 of this paper, to every pair of treatment scores. This is analogous to the t-test
in Fisher's least significant difference method.


Test procedure 


(1) Choose the desired significance level $\alpha$.

(2) (a) For small experiments, use the available methods and tables of Bradley & Terry,
and David, to test whether a difference between treatments exists.

(b) For larger experiments, use Durbin's method; that is, compare

$$D = \sum_{i=1}^{t} d_i^2 = 4\Big[\sum_{i=1}^{t} a_i^2 - \tfrac14 t n^2 (t-1)^2\Big]\Big/(nt) \qquad (45)$$

with the upper $100\alpha$% point of the $\chi^2$-distribution with $(t-1)$ degrees of freedom and
reject the hypothesis of equivalence of treatments if $D$ is greater than the critical
$\chi^2$-value.
(3) If no significant difference between treatments is found in 2, the test is completed.
If the test in 2 shows significance, find the critical value $m_\alpha$ for a two-sided test on
two pre-assigned treatments described above and declare every pair of treatment scores differing
by as much as $m_\alpha$ to be significantly different.
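The two computations in this procedure can be sketched as follows. The code is an illustration, not a definitive implementation: it assumes the form of $D$ given in (45) and the normal-approximation critical difference $z(\tfrac12 nt)^{1/2} + \tfrac12$, rounded up, that is used in the worked example of § 8. The function names are hypothetical, and the normal and $\chi^2$ percentage points are supplied from tables.

```python
from math import ceil, sqrt

def durbin_d(scores, n):
    """D of (45): 4 * [sum a_i^2 - t n^2 (t-1)^2 / 4] / (n t)."""
    t = len(scores)
    return 4 * (sum(a * a for a in scores) - t * n**2 * (t - 1) ** 2 / 4) / (n * t)

def lsd_critical_difference(z, t, n):
    """Smallest integer score difference declared significant."""
    return ceil(z * sqrt(n * t / 2) + 0.5)

scores = [66, 51, 89, 24, 70]  # carbon paper data, n = 30 repetitions
d = durbin_d(scores, n=30)
chi2_4_05 = 9.488              # tabulated upper 5% point, 4 degrees of freedom
if d > chi2_4_05:
    print(d, lsd_critical_difference(1.96, t=5, n=30))
```

For the carbon paper scores this gives $D$ close to 62.77 and a two-sided critical difference of 18.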


7. A METHOD FOR JUDGING CONTRASTS OF TREATMENT SCORES 


An argument analogous to that employed by Scheffé (1953) to test contrasts of means
in the analysis of variance can be used to develop a method for judging contrasts of treatment
scores after finding the observed value of $D$ in the paired-comparison experiment
to be significant. Consider the set of orthogonal contrasts

$$Q_k = \sum_{i=1}^{t} L_{ik} d_i \quad (k = 1, \ldots, t-1), \qquad (46)$$

where

$$\sum_{i=1}^{t} L_{ik} = 0, \qquad (47)$$

and $d_i$ is as defined in equation (33). Using relations (34), (35), and (36), we obtain the
variance of $Q_k$ to be

$$S_k = \sum_{i=1}^{t}\sum_{j=1}^{t} \rho_{ij} L_{ik} L_{jk} \sigma_i \sigma_j = \sum_{i=1}^{t} L_{ik}^2. \qquad (48)$$

From the theory of orthogonal contrasts

$$D = \sum_{i=1}^{t} d_i^2 = \sum_{k=1}^{t-1} Q_k^2/S_k, \qquad (49)$$

for every set $\{Q_k\}$ of $(t-1)$ mutually orthogonal contrasts. Since the terms on the
right-hand side of (49) are all non-negative, we have

$$D \ge Q_k^2/S_k \quad (k = 1, \ldots, t-1), \qquad (50)$$

with equality occurring when $L_{ik} = d_i$ $(i = 1, \ldots, t)$. Hence, if $D_\alpha$ is the
upper $100\alpha$% point of the distribution of $D$ under $H_0$,

$$\Pr(D \le D_\alpha) = \Pr(\text{for all possible contrasts } Q_k,\ Q_k^2/S_k \le D_\alpha) = 1 - \alpha. \qquad (51)$$

Hence, after performing the $D$-test and rejecting

$$H_0: \pi_{i.} = \tfrac12 \quad (\text{for all } i) \qquad (52)$$

at the $\alpha$ level of significance, the experimenter may judge all contrasts of treatment
scores in which he is interested with the following.


Test procedure 
(1) Calculate

$$Q^2 = \Big(\sum_{i=1}^{t} L_i d_i\Big)^2 = 4\Big(\sum_{i=1}^{t} L_i a_i\Big)^2\Big/(nt). \qquad (53)$$

(2) Calculate

$$S = \sum_{i=1}^{t} L_i^2. \qquad (54)$$

(3) Calculate $SD_\alpha$, where $D_\alpha$ is the critical value used in the $D$-test.
(4) If $Q^2 > SD_\alpha$, declare the contrast significantly different from zero.


For small experiments, $\alpha$ and $D_\alpha$ are determined exactly from tabled distributions
and, for larger experiments, the approximation $D_\alpha = \chi^2_{\alpha,\,t-1}$, the upper
$100\alpha$% point of the $\chi^2$ distribution with $(t-1)$ degrees of freedom, is used.
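The four steps above translate directly into a few lines of code. The sketch below is illustrative: the function name is hypothetical, the coefficients $L_i$ are checked to sum to zero, and $D_\alpha$ is passed in as a tabulated value.

```python
def contrast_test(L, scores, n, d_alpha):
    """Return (Q^2, S * D_alpha, significant?) for one contrast of scores."""
    t = len(scores)
    if abs(sum(L)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    q2 = 4 * sum(l * a for l, a in zip(L, scores)) ** 2 / (n * t)  # equation (53)
    s = sum(l * l for l in L)                                      # equation (54)
    return q2, s * d_alpha, q2 > s * d_alpha

scores = [66, 51, 89, 24, 70]   # carbon paper data, n = 30
d_alpha = 9.488                 # tabulated chi-square point, 4 d.f., 5% level
print(contrast_test([0, -1, 1, 0, 0], scores, 30, d_alpha))   # a_3 - a_2
print(contrast_test([-1, 0, 1, 0, 0], scores, 30, d_alpha))   # a_3 - a_1
```

The first contrast exceeds $SD_\alpha = 18.976$ and the second does not, in line with the worked example of § 8.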


8. EXAMPLES 


To illustrate the test procedures presented above, they are applied to data from a
paired-comparison experiment run on five brands of carbon paper and described by Fleckenstein,
Freund & Jackson (1958). The experiment and analysis were carried out according to the
method described by Scheffé (1952). Since the test methods in this paper do not allow for
scaling or ties, the degree of preference is ignored and merely recorded as a preference, and
ties were randomly assigned as preferences to one of the two members of each tied pair. The
number of times each paper is preferred in the thirty repetitions of the experiment is listed
below:

$a_1 = 66$; $a_2 = 51$; $a_3 = 89$; $a_4 = 24$; $a_5 = 70$.

We have arbitrarily decided to run all tests at the 5% significance level.


8.1. Test of pre-assigned treatment


Suppose that before the carbon paper experiment, brand 5 had received considerable
recommendation because of its low cost and reputed quality. In such a case, the experimenter
may be particularly interested in testing whether brand 5 is better than the
average of the five brands. The test of $H_0$: $\pi_{5.} = \tfrac12$ against
$H_1$: $\pi_{5.} > \tfrac12$ would proceed as follows:

(1) Significance level: 5%.

(2) Since $n(t-1) = 30(4) = 120$, we may use the normal approximation to find the
critical value, $a_\alpha$, for the score of the pre-assigned treatment

$$a_\alpha = 1.64[n(t-1)/4]^{1/2} + \tfrac12 n(t-1) + \tfrac12 = 69.48.$$

(3) $a_5 = 70$.
(4) $a_5 > a_\alpha$. (The actual significance level of $a_5$ is 0.0412 under $H_0$.) Our
conclusion is that brand 5 is better than the average of the five brands.
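The arithmetic of this example can be reproduced as follows. The sketch assumes only what the example states: under $H_0$ the score of a treatment is binomial with index $n(t-1)$ and probability $\tfrac12$, and the critical value uses the normal approximation with a continuity correction of $\tfrac12$. The function names are hypothetical.

```python
from math import comb, sqrt

def normal_critical_score(z, t, n):
    """z * sqrt(n(t-1)/4) + n(t-1)/2 + 1/2, as in step (2)."""
    trials = n * (t - 1)
    return z * sqrt(trials / 4) + trials / 2 + 0.5

def binomial_upper_tail(k, trials):
    """Exact Pr(X >= k) for X ~ Binomial(trials, 1/2)."""
    return sum(comb(trials, s) for s in range(k, trials + 1)) / 2**trials

print(normal_critical_score(1.64, t=5, n=30))   # about 69.48
print(binomial_upper_tail(70, 120))             # exact level attained by a_5 = 70
```

The exact tail probability gives the significance level actually attained by the observed score, which the example quotes as 0.0412.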


8.2. Test of equality of two pre-assigned brands

If brand 4 were less expensive than brand 2, we might suppose that there would be
interest expressed prior to the experiment on whether or not brand 2 is actually superior to
brand 4. The one-sided test of $H_0$: $\pi_{4.} = \pi_{2.}$ against $H_1$: $\pi_{2.} > \pi_{4.}$
would proceed as follows:

(1) Significance level: 5%.

(2) $1.64(nt/2)^{1/2} + 0.5 = 1.64\sqrt{75} + 0.5 = 14.7$, $m_\alpha = 15$.

(3) $a_2 - a_4 = 51 - 24 = 27 > m_\alpha$. We accept the alternative hypothesis $H_1$ that
brand 2 is superior to brand 4.

8.3. Test of the highest score

It is natural for the experimenter to wonder whether the highest score $a_{(1)}$, which in this
example is the score of brand 3, is actually significantly larger than average; that is, if
$\pi_{3.} > \tfrac12$. The test of the null hypothesis against this alternative is as follows:

(1) Significance level: 5%.

(2) From the binomial tables (Harvard University, 1955), we find for $n = 120$ and
$p = \tfrac12$,

$$\Pr(a_i \ge 74 \mid H_0) = 0.033/5,$$

and

$$\Pr(a_i \ge 73 \mid H_0) > 0.05/5.$$

Therefore, $M_\alpha = 74$ and $\beta = 0.033$.

(3) $a_3 = a_{(1)} = 89 > M_\alpha$. Our conclusion is that brand 3 is significantly better than
average.
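The search for $M_\alpha$ in step (2) can be sketched with exact binomial tails in place of the Harvard tables. The sketch assumes, as the figures in the example suggest, that $\beta$ is $t$ times the upper tail probability of a single score and that $M_\alpha$ is the smallest integer for which $\beta \le \alpha$. The function names are hypothetical.

```python
from math import comb

def binomial_upper_tail(k, trials):
    """Exact Pr(X >= k) for X ~ Binomial(trials, 1/2)."""
    return sum(comb(trials, s) for s in range(k, trials + 1)) / 2**trials

def highest_score_critical_value(alpha, t, n):
    """Smallest M with t * Pr(a_i >= M | H0) <= alpha, and the attained beta."""
    trials = n * (t - 1)
    for m in range(trials + 1):
        beta = t * binomial_upper_tail(m, trials)
        if beta <= alpha:
            return m, beta
    raise ValueError("no critical value exists at this level")

print(highest_score_critical_value(0.05, t=5, n=30))
```

For the carbon paper experiment this returns $M_\alpha = 74$ with $\beta$ near 0.033.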
8.4. Multiple comparisons


To separate the significantly different scores into groups, the experimenter might use
the multiple range test, the least significant difference method, or the method of contrasts.


Multiple comparison range test
(1) Significance level: 5%.
(2) $W_{5,0.05} = 3.86$ (obtained from Table 22, Biometrika Tables for Statisticians).

$$R^* = \tfrac12 W_{5,0.05}\sqrt{nt} + \tfrac14 = 3.86\sqrt{150/4} + \tfrac14 = 23.887,$$

$$R^+ = 24 < n(t-1) - \tfrac12 n = 105,$$

$$\beta = \Pr[W_5 \ge (2R^+ - \tfrac12)(nt)^{-1/2}] = \Pr[W_5 \ge 3.878] = 0.048$$

(from Table 23, Biometrika Tables for Statisticians).
(3)   a_4    a_2    a_1    a_5    a_3
      24     51     66     70     89
             -----------------
                    -----------------

(4) Any two brands whose scores are not underlined by the same line in the above step
may be considered distinguishably different.
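The underlining in step (3) can be generated mechanically from the ordered scores and a critical difference. The sketch below is an illustration under the rule stated in the test procedure, namely that a difference not less than the critical value is significant; it lists the maximal runs of ordered scores whose extreme difference falls below the critical value. The function name and the dictionary layout are hypothetical.

```python
def homogeneous_groups(scores, critical_difference):
    """Maximal runs of ordered scores whose extreme difference is not significant.

    scores maps a brand label to its score; a difference is significant when it
    is not less than critical_difference.
    """
    ordered = sorted(scores.items(), key=lambda item: item[1])
    groups = []
    for start in range(len(ordered)):
        end = start
        while (end + 1 < len(ordered)
               and ordered[end + 1][1] - ordered[start][1] < critical_difference):
            end += 1
        group = [label for label, _ in ordered[start:end + 1]]
        # Runs are contiguous, so a non-maximal run is a subset of the last one kept.
        if not groups or not set(group) <= set(groups[-1]):
            groups.append(group)
    return groups

carbon = {1: 66, 2: 51, 3: 89, 4: 24, 5: 70}
print(homogeneous_groups(carbon, 24))   # multiple range test, R+ = 24
print(homogeneous_groups(carbon, 18))   # least significant difference, m = 18
```

Each returned group corresponds to one underline; a brand that appears alone is separated from all the others.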










Least significant difference method
(1) Significance level: 5%.

(2) $$D = 4\Big\{\sum a_i^2 - \tfrac14 t n^2 (t-1)^2\Big\}\Big/(nt) = 4\{20{,}354 - 18{,}000\}/150 = 62.77,$$

$$\chi^2_{4,\,0.05} = 9.488 < D.$$

Hence a significant difference exists between the scores.

(3) $1.96(\tfrac12 nt)^{1/2} + \tfrac12 = 1.96\sqrt{75} + \tfrac12 = 17.47$,
$m_\alpha = 18$,

      a_4    a_2    a_1    a_5    a_3
      24     51     66     70     89
             ----------
                    ----------

Any two brands whose scores are not underlined by the same line may be considered
distinguishably different.


Method of contrasts
We have found in Step 2 of the L.S.D. method that a significant difference exists between
brands using a $D$-test with critical value

$$D_{0.05} = \chi^2_{4,\,0.05} = 9.488.$$

Hence the steps in the contrasts method are as follows:
(1) The separation into groups is accomplished in this case by computing the squares
of four contrasts.

$$Q_1^2 = 4(a_3 - a_2)^2/nt = 4(89 - 51)^2/150 = 38.5,$$
$$Q_2^2 = 4(a_3 - a_1)^2/nt = 4(89 - 66)^2/150 = 14.1,$$
$$Q_3^2 = 4(a_5 - a_2)^2/nt = 4(70 - 51)^2/150 = 9.6,$$
$$Q_4^2 = 4(a_2 - a_4)^2/nt = 4(51 - 24)^2/150 = 19.4.$$

(2) For each contrast

$$S = \sum_{i=1}^{5} L_i^2 = 2.$$

(3) $SD_\alpha = 2(9.488) = 18.976$.
(4) $Q_1^2 > Q_4^2 > SD_\alpha > Q_2^2 > Q_3^2$.
Hence,

      a_4    a_2    a_1    a_5    a_3
      24     51     66     70     89
             -----------------
                    -----------------

Any two brands whose scores are not underlined by the same line may be considered
significantly different.


9. SUMMARY 


The tests for paired-comparison experiments presented in this paper have dealt with the
evaluation of individual treatments, pairs of treatments, the treatment with the highest
score, contrasts of treatment scores, and the separation of treatment effects. The methods,
in the main, are based on the binomial distribution and its asymptotic approximations.
These methods are developed under the null hypothesis

$$H_0: \pi_{ij} = \tfrac12 \text{ for all } i, j,$$

expressing treatment equality. When the actual null hypothesis of the problem (e.g.
$H_0'$: $\pi_{i.} = \pi_{j.}$, when testing for a difference between two particular treatments)
is true and $H_0$ is not true, the test based on $H_0$ is a conservative test of $H_0'$, that is,
the probability of exceeding the critical value is less than it was under $H_0$.

The three methods developed for the multiple comparison of treatment scores are 
analogous to methods used for the multiple comparison of sample means in the analysis of 
variance. The three methods will not, in general, lead to the same interpretation of the 
paired-comparison data for the same reasons that their analogues in the analysis of variance 
do not, in general, give the same results. (For a discussion of this subject, the reader is 
referred to an expository paper by Duncan, 1955.) Also it is shown that one of these methods, 
the multiple comparison range test, is easily extended for similar analyses of balanced 
k (2 < k < t) treatments per block subjective experiments. 


The authors wish to express their gratitude to Prof. R. A. Bradley for valuable suggestions 
and criticisms. 


REFERENCES 


Bradley, R. A. (1954). Rank analysis of incomplete block designs. II. Additional tables for the
method of paired comparisons. Biometrika, 41, 502-37.

Bradley, R. A. & Terry, M. E. (1952). Rank analysis of incomplete block designs. I. The method
of paired comparisons. Biometrika, 39, 324-45.

Cramér, H. (1946). Mathematical Methods of Statistics. Princeton: Princeton University Press.

David, H. A. (1959). Tournaments and paired comparisons. Biometrika, 46, 139-49.

David, H. A. (1960). A conservative property of binomial tests. Ann. Math. Statist. 31, 1205-7.

Duncan, D. B. (1955). Multiple range and multiple F tests. Biometrics, 11, 1-42.

Durbin, J. (1951). Incomplete blocks in ranking experiments. Brit. J. Psychol. (Statist. Sect.),
4, 85-90.

Federer, W. T. (1955). Experimental Design. New York: The Macmillan Company.

Fleckenstein, M., Freund, R. A. & Jackson, J. E. (1958). A paired comparison test of typewriter
carbon papers. Tappi, 41, 128-30.

Hartley, H. O. (1950). The use of range in analysis of variance. Biometrika, 37, 271-80.

Harvard University Computation Laboratory (1955). Tables of the Cumulative Binomial Probability
Distribution. Cambridge, Mass.: Harvard University Press.

Pearson, E. S. & Hartley, H. O. (1954). Biometrika Tables for Statisticians, 1. Cambridge
University Press.

Scheffé, H. (1952). An analysis of variance for paired comparisons. J. Amer. Statist. Ass. 47,
381-400.

Scheffé, H. (1953). A method for judging all contrasts in the analysis of variance. Biometrika,
40, 87-104.

Starks, T. H. (1958). Significance tests in experiments involving paired comparisons. Ph.D.
dissertation, Virginia Polytechnic Institute Library, Blacksburg, Virginia.

Thurstone, L. L. (1927). Psychophysical analysis. Amer. J. Psychol. 38, 368-89.













Biometrika (1961), 48, 1 and 2, p. 109 109 
Printed in Great Britain 


Goodness-of-fit tests on a circle 


By G. S. WATSON†


University of Toronto and Research Triangle Institute 


1. INTRODUCTION 


To test whether a random sample $X_1, \ldots, X_N$ has been drawn from a population with a
specified continuous distribution function, $F(x)$, some measure of the difference between
this and the sample distribution function, $F_N(x)$, may be used. Well-known test statistics of
this nature are

$$W_N^2 = N\int_{-\infty}^{\infty}\{F_N(x) - F(x)\}^2\,dF(x) \qquad (1)$$

and

$$K_N = \sqrt{N}\sup_{-\infty<x<\infty}|F_N(x) - F(x)|. \qquad (2)$$

A summary of what is known of these tests and extensions of them is given by Darling
(1957). The statistic (1) is associated with the names of Cramér, Von Mises and Smirnov and
the latter with those of Kolmogorov and Smirnov. For practical use, both (1) and (2) are
available. Neither test has been extended for use when $F(x)$ contains parameters requiring
estimation.

In an investigation of the Smirnov statistic, (1), it was observed that while the limiting
distribution is very awkward and requires electronic computing for its tabulation, the
distribution of

$$U_N^2 = N\int_{-\infty}^{\infty}\Big[F_N(x) - F(x) - \int_{-\infty}^{\infty}\{F_N(y) - F(y)\}\,dF(y)\Big]^2 dF(x) \qquad (3)$$

is very simple and hardly requires tabulation, as will be shown in § 2. This at first appeared
to be only of academic interest, for although (3) provides a consistent test, it is less natural
than the tabulated $W_N^2$. It will be noted that $U_N^2$ has the form of a variance while
$W_N^2$ has the form of a second moment about the origin, i.e. the modification corresponds to a
'correction for the mean'. Further comparisons are made in § 3.

The above goodness-of-fit tests refer to cases where the sample space is the real line.
Turning now to the case where the probability is distributed on the circumference of a circle,
a difficulty arises in the application of these tests: there is no natural starting point for the
distribution function and different arbitrary starting points will give the test statistics, (1)
and (2) above, different values. Kuiper (1960) raises this problem and suggests a variant
of (2). He supposes that some arbitrary starting point has been used for a circle of unit
circumference. It is clear that

$$V_N = \sup_{0\le x\le 1}\{F_N(x) - F(x)\} - \inf_{0\le x\le 1}\{F_N(x) - F(x)\} \qquad (4)$$

is independent of the starting point and he shows, using an unpublished result of Darling's,
that

$$\text{prob}\{\sqrt{N}\,V_N \le c\} = 1 - \sum_{j=1}^{\infty} 2(4j^2c^2 - 1)e^{-2j^2c^2} + \frac{8c}{3\sqrt{N}}\sum_{j=1}^{\infty} j^2(4j^2c^2 - 3)e^{-2j^2c^2} + O\Big(\frac{1}{N}\Big). \qquad (5)$$


† The main result of this paper was derived in 1959 while the author was a Research Associate,
Department of Mathematics, Princeton University, and partially supported by ONR funds.







He checks his results against 200 artificial samples of 10. Kuiper also gives a similar solution 
to the two-sample problem. 
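Kuiper's statistic (4) is easily computed once the sample has been transformed to $v_i = F(x_i)$, so that the hypothetical distribution is uniform on the unit circumference. The sketch below is an illustration under that assumption; the function name is hypothetical. It also checks numerically that moving the starting point leaves $V_N$ unchanged.

```python
def kuiper_v(v):
    """V_N of (4) for values v_i = F(x_i) lying in [0, 1)."""
    v = sorted(v)
    n = len(v)
    d_plus = max((i + 1) / n - vi for i, vi in enumerate(v))   # sup of F_N - F
    d_minus = max(vi - i / n for i, vi in enumerate(v))        # -inf of F_N - F
    return d_plus + d_minus

sample = [0.10, 0.15, 0.40, 0.80]
rotated = [(vi + 0.3) % 1.0 for vi in sample]   # same points, new starting point
print(kuiper_v(sample), kuiper_v(rotated))
```

Both calls give the same value, 0.45 for this sample, up to rounding error.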

The suggestion of this paper is that statistic (3) be used, with the circumference of the 
circle replacing the real line as the region of integration. For it will be shown in § 2 that its 
distribution is convenient and does not depend on the starting point. Greenwood (1959) in 
an unpublished report made the same suggestion but was unable to give the relevant 
distribution. 

When some definite parametric class of alternative distributions is given, neither test (3) 
nor test (4) should be used since it should be possible to derive a special test that is more 
powerful than these general purpose tests. 

A two-sample test based on an extension of (3) will be given in a further paper and applied 
there to some real data. 


2. ASYMPTOTIC DISTRIBUTION OF $U_N^2$


It is first necessary to show that (3), when adapted to the circle of unit circumference, is
independent of the point chosen for starting to cumulate the distributions. Let $d_xF$ denote
the hypothetical probability element at $x$. Suppose there is another probability distribution
on the circle and that $G(x; x_0)$ and $G(x; x_1)$ are its cumulative distribution functions begun
respectively at $x = x_0$ and $x = x_1$. All these functions are naturally of period 1.

Taking $0 < x_1 - x_0 < 1$, it is clear that

$$G(x; x_0) = \begin{cases} G(x; x_1) + G(x_1; x_0), & x_1 \le x \le x_0 + 1, \\ G(x; x_1) - 1 + G(x_1; x_0), & x_0 + 1 \le x \le x_1 + 1. \end{cases} \qquad (6)$$

Hence

$$G(x; x_0) - \int_0^1 G(x; x_0)\,d_xF = G(x; x_1) - \int_0^1 G(x; x_1)\,d_xF + \int_{x_0}^{x_1} d_xF, \qquad (7)$$

where $x_1 \le x \le x_0 + 1$, and for $x_0 + 1 \le x \le x_1 + 1$ the right-hand side of (7) is
decreased by one. Thus, if there is yet another distribution on the circle with cumulatives
$H(x; x_0)$, $H(x; x_1)$, (7) will again be true with $H$ replacing $G$. Hence

$$\{G(x; x_0) - H(x; x_0)\} - \int_0^1\{G(x; x_0) - H(x; x_0)\}\,d_xF$$

is independent of the value of $x_0$ which may be taken as zero for convenience. Identifying the
$G$-distribution with the sample distribution $F_N(x)$ and the $H$-distribution with the
hypothetical distribution $F(x)$, we may now write, without ambiguity,

$$U_N^2 = N\int_0^1\Big\{F_N(x) - F(x) - \int_0^1[F_N(y) - F(y)]\,d_yF\Big\}^2 d_xF,$$

where $d_xF = dF(x)$.


Since there is no loss of generality in assuming the hypothetical distribution to be uniform,
the canonical form of our statistic is

$$U_N^2 = N\int_0^1\Big\{F_N(u) - u - \int_0^1[F_N(y) - y]\,dy\Big\}^2 du. \qquad (8)$$

To find the limiting distribution of (8) we follow the approach of Doob (1949) and define

$$Z_N(u) = \sqrt{N}\Big\{F_N(u) - u - \int_0^1[F_N(y) - y]\,dy\Big\}. \qquad (9)$$











For $0 \le u, v \le 1$, it may be shown that

$$\text{cov}\{Z_N(u), Z_N(v)\} = \min(u, v) - \tfrac12(u + v) + \tfrac12(u - v)^2 + \tfrac{1}{12}.$$

Since $\text{var}\{Z_N(u)\} = \tfrac{1}{12}$, it is convenient to write

$$X_N(u) = \sqrt{12}\,Z_N(u),$$

hence for $-1 \le \tau \le 1$,

$$\text{cov}\{X_N(u), X_N(u + \tau)\} = 1 - 6|\tau|(1 - |\tau|). \qquad (10)$$

Thus

$$12U_N^2 = \int_0^1 X_N(u)^2\,du. \qquad (11)$$

As $N \to \infty$, $X_N(u)$ may be taken as a Gaussian process $X(u)$ with zero mean and a
covariance function given by the right-hand side of (10) which we will call $\rho(\tau)$.

The asymptotic distribution of $12U_N^2$ may now be derived by the methods of Kac &
Siegert (1947a). They are merely a generalization of the usual methods for obtaining the
distribution of the sum of squares of a finite number of multivariate normal variables with
zero means. Thus we need the eigen-solutions of

$$\int_0^1 \rho(s - t)h(t)\,dt = \lambda h(s) \quad (0 \le s \le 1). \qquad (12)$$

Inspection shows that $h$ = constant, $\lambda = 0$ is a solution. By splitting the range of
integration at $t = s$ and differentiating (12) with respect to $s$, it may be reduced to

$$h''(s) + \frac{12}{\lambda}h(s) = 0, \qquad \int_0^1 h(s)\,ds = 0. \qquad (13)$$

The second condition of (13) is required to give orthogonality of all the solutions. Thus we
have

$$h(s) = \text{constant} \quad (\lambda = 0),$$
$$h(s) = \cos 2m\pi s,\quad h(s) = \sin 2m\pi s \quad (\lambda = 3/m^2\pi^2;\ m = 1, 2, \ldots). \qquad (14)$$

Hence $12U_N^2$ is distributed asymptotically as

$$12U^2 = \sum_{m=1}^{\infty}\frac{3}{m^2\pi^2}(a_m^2 + b_m^2), \qquad (15)$$

where $a_m$, $b_m$ $(m = 1, \ldots, \infty)$ are independent sequences of independent standard
normal variables. If $w_m$ $(m = 1, \ldots, \infty)$ is a sequence of independent random variables
with the same density functions $e^{-w}$, then

$$U^2 = \sum_{m=1}^{\infty}\frac{1}{2m^2\pi^2}w_m. \qquad (16)$$

Thus the moment generating function of $U^2$ is given by

$$E(e^{\theta U^2}) = \prod_{m=1}^{\infty}\Big(1 - \frac{\theta}{2m^2\pi^2}\Big)^{-1} \qquad (17)$$

$$= \Big[\frac{\sin\sqrt{\tfrac12\theta}}{\sqrt{\tfrac12\theta}}\Big]^{-1} \qquad (18)$$

by the infinite product for the sine function. Thus

$$E(e^{\theta U^2}) = \sum_{m=1}^{\infty}\frac{c_m}{1 - (\theta/2m^2\pi^2)}, \qquad (19)$$

where

$$c_m = \lim_{\theta\to 2m^2\pi^2}\{1 - (\theta/2m^2\pi^2)\}\frac{\sqrt{\tfrac12\theta}}{\sin\sqrt{\tfrac12\theta}},$$

i.e.

$$c_m = (-1)^{m-1}\,2. \qquad (20)$$

Inverting (19) with this value for $c_m$, the density function of $U^2$ is found to be

$$p(v) = \sum_{m=1}^{\infty}(-1)^{m-1}\,2\cdot 2m^2\pi^2\,e^{-2m^2\pi^2 v}. \qquad (21)$$

Thus

$$\text{prob}(U^2 > v) = \sum_{m=1}^{\infty}(-1)^{m-1}\,2e^{-2m^2\pi^2 v} \qquad (22)$$
$$= 2e^{-2\pi^2 v} - 2e^{-8\pi^2 v} + 2e^{-18\pi^2 v} - \ldots,$$

which is seen to converge very rapidly. (22) is thus the limiting distribution of $U_N^2$ as
$N \to \infty$. It does not contain, as (5) does, any correction terms for finite $N$. Both
solutions have a form which is characteristic of distributions obtained in this area. The
distribution (22) has arisen before in noise theory; see e.g. Kac & Siegert (1947b) and Lampard
(1956). Even more interesting is the fact that (22) is essentially the asymptotic distribution of
$K_N^2$; to be precise,

$$\lim_{N\to\infty}\text{prob}\{K_N < \pi\sqrt{v}\} = \lim_{N\to\infty}\text{prob}\{U_N^2 < v\}.$$

It seems very surprising that the asymptotic distributions of $K_N^2\pi^{-2}$ and $U_N^2$ should
be the same. The distribution (22) has been tabulated by Smirnov (1948). The same paper gives an
alternative form for (22) which is useful if $v$ is very small.
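Because (22) converges so quickly, an approximate significance level can be computed directly. The sketch below simply sums the series; the function name and the number of terms retained are hypothetical choices, and for very small $v$ the alternative form mentioned above would be preferable to this truncated sum.

```python
from math import exp, pi

def watson_u2_tail(v, terms=50):
    """Asymptotic prob(U^2 > v) from the alternating series (22)."""
    if v <= 0:
        return 1.0
    return sum((-1) ** (m - 1) * 2 * exp(-2 * m * m * pi * pi * v)
               for m in range(1, terms + 1))

for v in (0.10, 0.187, 0.267):
    print(v, watson_u2_tail(v))
```

By this formula the tail probability is close to 0.05 at $v = 0.187$ and close to 0.01 at $v = 0.267$; at such values the first term of the series already dominates.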

The reason why statistic (3) is so much easier to handle than statistic (1) is of some 
interest. It will be recalled that in the distribution theory of the serial correlation coef- 
ficients, the circular redefinition of the coefficients introduces circulant matrices and usually, 
after elimination of the mean, the remaining latent roots are paired. Precisely the same 
things happen here. The covariance function (10) is, for $0 \le \tau \le 1$, symmetrical about
$\tau = \tfrac12$, the odd root is eliminated (i.e. is zero now) and the remainder are paired. There is the
further simplification that their pattern is such that the moment generating function is an 
elementary function. For the case when F(x) involves parameters requiring estimation, 
there is the double difficulty that the appropriate eigen-values are in general simple (making 
the inversion difficult) and, in general, extremely hard to find. Again, no way has been found, 
along these lines, of overcoming these difficulties. 


3. COMMENTS ON $U_N^2$


The statistic $U_N^2$ arose in another connexion, was observed to have a simple distribution
and then its relevance to the goodness-of-fit test on the circle was noted. It is possible,
however, to introduce $U_N^2$ in a way that may be more attractive to the reader. Using the
notation of the beginning of § 2, suppose we wish to choose rationally some starting point $x_0$
on the circle in order to make the goodness-of-fit test based on $W_N^2$. Let us write

$$W_N^2(x_0) = N\int_0^1\{F_N(x; x_0) - F(x; x_0)\}^2\,d_xF. \qquad (23)$$

One immediate suggestion would be to choose $x_0$ to make $W_N^2(x_0)$ a minimum. From (6), with

$$\Delta(x; x_0) = F_N(x; x_0) - F(x; x_0),$$

we derive

$$\Delta(x; x_0) = \Delta(x; x_1) + \Delta(x_1; x_0).$$




















Thus

$$\min_{x_0} W_N^2(x_0) = \min_{A}\,N\int_0^1\{[F_N(x; 0) - F(x; 0)] - A\}^2\,d_xF, \qquad (24)$$

where $A$ is a constant. But the right-hand side of (24) is trivially shown to be

$$U_N^2 = N\int_0^1\Big\{F_N(x) - F(x) - \int_0^1[F_N(y) - F(y)]\,d_yF\Big\}^2 d_xF,$$

where, as in § 2, $F_N(x) = F_N(x; 0)$, $F(x) = F(x; 0)$. Thus

$$U_N^2 = \min_{x_0} W_N^2(x_0).$$

This result is also evident from the computing forms of the statistics. For the ordered
sample $x_{(1)} \le x_{(2)} \le \ldots \le x_{(N)}$, write, with $F(x) = F(x; 0)$,

$$v_i = F(x_{(i)}).$$

Then

$$W_N^2(0) = \sum_{i=1}^{N}\Big(v_i - \frac{2i-1}{2N}\Big)^2 + \frac{1}{12N} \qquad (25)$$

is the usual form for $W_N^2$. To find the form for $U_N^2$ a similar method may be used with the
addition of the result that

$$\int_0^1[F_N(x) - F(x)]\,d_xF = \frac12 - \frac1N\sum_{i=1}^{N}F(x_i) = \frac12 - \bar v.$$

Hence it may be shown that

$$U_N^2 = \sum_{i=1}^{N}\Big(v_i - \frac{2i-1}{2N} - \bar v + \frac12\Big)^2 + \frac{1}{12N}. \qquad (26)$$

Since $\bar v - \tfrac12$ is the arithmetic mean of $v_i - (2i-1)/2N$ $(i = 1, 2, \ldots, N)$,
$W_N^2$ and $U_N^2$ are again seen as respectively a sum of squares about the origin and of squares
of deviations from a mean.

Actually it may be easier to use, instead of (26), the equivalent form

$$U_N^2 = \sum_{i=1}^{N}v_i^2 - \sum_{i=1}^{N}\frac{(2i-1)v_i}{N} + N\Big\{\frac13 - \Big(\bar v - \frac12\Big)^2\Big\}. \qquad (27)$$

Since we are only concerned here with the circular problem, there is no point in a comparison
of $W_N^2$ and $U_N^2$, but only between $U_N^2$ and $V_N$. This latter comparison should be made
with respect to their powers but no such information is known. For practical applications it is
necessary to use the asymptotic distribution (22), since the distribution of $U_N^2$ in small
samples is not known. The merit of this approximation is untested at present. Assuming
that it is satisfactory, the method of computation of $U_N^2$ from (27), and the derivation
of an approximate significance level from (22) are very simple so that no artificial
example is called for and the only real data in the author's hands concerns the two-sample
problem.
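The two computing forms can be compared numerically. The sketch below is illustrative: it assumes the sample has already been transformed to $v_i = F(x_i)$ in $[0, 1)$, the function names are hypothetical, and the data are made up for the check. It confirms that (26) and (27) agree and that the value does not change when the starting point on the circle is moved.

```python
def watson_u2(v):
    """U_N^2 from the sum-of-squared-deviations form (26)."""
    v = sorted(v)
    n = len(v)
    vbar = sum(v) / n
    return sum((vi - (2 * i - 1) / (2 * n) - vbar + 0.5) ** 2
               for i, vi in enumerate(v, start=1)) + 1 / (12 * n)

def watson_u2_alt(v):
    """U_N^2 from the equivalent form (27)."""
    v = sorted(v)
    n = len(v)
    vbar = sum(v) / n
    return (sum(vi * vi for vi in v)
            - sum((2 * i - 1) * vi for i, vi in enumerate(v, start=1)) / n
            + n * (1 / 3 - (vbar - 0.5) ** 2))

sample = [0.05, 0.12, 0.31, 0.33, 0.58, 0.91]    # illustrative values only
rotated = [(vi + 0.4) % 1.0 for vi in sample]    # same points, new starting point
print(watson_u2(sample), watson_u2_alt(sample), watson_u2(rotated))
```

All three printed values coincide up to rounding error.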


The author is grateful to the Editor for some helpful comments. 




REFERENCES 


Darling, D. A. (1957). The Kolmogorov-Smirnov, Cramér-Von Mises tests. Ann. Math. Statist.
28, 823-38.

Doob, J. L. (1949). Heuristic approach to the Kolmogorov-Smirnov theorems. Ann. Math. Statist.
20, 393-403.

Kac, M. & Siegert, A. J. F. (1947a). An explicit representation of a stationary Gaussian process.
Ann. Math. Statist. 18, 438-42.

Kac, M. & Siegert, A. J. F. (1947b). On the theory of noise in radio receivers with square law
detectors. Jour. Appl. Phys. 18, 383-97.

Kuiper, N. H. (1960). Tests concerning random points on a circle. Proc. Koninkl. Nederl. Akad. Van
Wetenschappen, Series A, 63, 38-47.

Lampard, D. G. (1956). The probability distribution for the filtered output of a multiplier. I.R.E.
Trans. Professional Group on Information Theory, IT-2, 1, 4-11.

Smirnov, N. V. (1948). Table for estimating the goodness of fit of empirical distributions. Ann. Math.
Statist. 19, 279-81.































Biometrika (1961), 48, 1 and 2, p. 115 115 
Printed in Great Britain 


The use of orthogonal polynomials of the positive and negative 
binomial frequency functions in curve fitting by Aitken’s method 


By H. T. GONIN 
University of South Africa 


1. Introduction. Frequency data not far from the normal can be fairly represented by a 
type A series, which often is a normal function plus a corrective term involving the third 
Hermite polynomial. Frequency data thought to be nearly of Poisson type, but with certain 
perturbations away from that, can correspondingly be fairly represented by a Poisson 
function, plus a corrective term involving the second Charlier polynomial for the value of 
the mean in question, most easily achieved, however, by taking the second backward finite 
differences of the Poisson fitted class frequencies, multiplying these by a certain easily com- 
puted constant, and so correcting the Poisson class frequencies by (usually) small amounts. 

In this paper we consider the graduation of a set of data using an expansion similar to
the Gram-Charlier type B expansion but based on (i) the positive binomial, (ii) the
negative binomial. The form thus assumed is $y_x = \{a_0 + a_1 G_1(x) + a_2 G_2(x) + \dots\}\,\phi(x)$, where
the $G$'s are an orthogonal set with respect to $\phi(x)$ (which is binomial) and where the
problem is to determine 'best' values of $a_0, a_1, \dots$, etc. The index $n$ and the fraction $p$
of the binomial are assumed known, or are estimated from the first two moments of the
data. We are mainly concerned with a fit to a set of data of nearly binomial form, and the
problem of efficiently estimating the parameters for a given Gram-Charlier type B model
is more complicated and of a somewhat different nature.

The method used is that of weighted least squares, the weights being $\phi(x)^{-1}$. In the case
of many data, such as, e.g., the number of deaths $O_x$ in a mortality table, the weights are
proportional to the inverse of the square of the standard deviation and approximate to
$y_x^{-1}$ (in our case we use $\phi(x)$ as a first approximation to $y_x$). The scheme followed is that
introduced by Aitken (1932-33) in the unweighted rectangular case where the orthogonal
polynomials are those of Tchebycheff.

In the latter case Fisher’s scheme is perhaps just as simple, but the tabulation of the
polynomials in the binomial case introduces a serious complication. The values of the
orthogonal polynomials would have to be tabulated, in the case of the positive binomial,
e.g., for two parameters, the first being the value of p (and this, for fine grading, would need
hundreds of values), the second the number of data. By using Aitken’s method the corrective
terms emerge as simple by-products of the factorial moments and so the work is quite
economical.


2. Aitken’s method of fitting in the weighted case. Let us fit a series of observed frequencies
$O_x$ by
$$y_x = \{a_0 G_0(x) + a_1 G_1(x) + \dots + a_r G_r(x)\}\,\phi(x) \tag{1}$$
by the method of least squares, with weighting function $\phi(x)^{-1}$.
By Legendre’s principle
$$S^2 = \sum_x \phi(x)^{-1}\,\bigl[O_x - \{a_0 G_0 + a_1 G_1 + \dots + a_r G_r\}\,\phi(x)\bigr]^2$$
must be a minimum.





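Hence $\partial S^2/\partial a_r = 0$. Thus
$$a_r \sum_x \phi(x)\,G_r^2(x) = \sum_x O_x\,G_r(x). \tag{2}$$
Let
$$G_r(x) = b_0^r + b_1^r x_{(1)} + b_2^r x_{(2)} + \dots + b_r^r x_{(r)}, \tag{3}$$
where $x_{(r)} = x^{(r)}/r!$ and $x^{(r)} = x(x-1)\dots(x-r+1)$.
We multiply (3) by $O_x$ and sum over all the values of $x$, obtaining
$$\sum_x O_x G_r(x) = b_0^r \sum_x O_x + b_1^r \sum_x O_x x_{(1)} + \dots + b_r^r \sum_x O_x x_{(r)}$$
$$= b_0^r m_{(0)} + b_1^r m_{(1)} + \dots + b_r^r m_{(r)}, \tag{4}$$
where $m_{(r)} = \sum_x O_x x_{(r)}$, the rth reduced factorial moment of the data.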

Now (3) is really a Gregory-Newton interpolation formula. Hence the coefficients of
$1, x_{(1)}, \dots, x_{(r)}$ of the polynomial $G_r(x)$, namely $b_0^r, b_1^r, \dots, b_r^r$, are the terminal values $G_r(0)$,
$\Delta G_r(0), \dots, \Delta^r G_r(0)$. (The indices of the coefficients indicate to what degree of polynomial
they belong, thus $b_s^r$ denotes the coefficient of $x_{(s)}$ in $G_r(x)$.)

This simple fact, noticed and turned to use by Aitken, is of great computational value
in his scheme. From (2) and (4) we have
$$a_r \sum_x \phi(x)\,G_r^2(x) = b_0^r m_{(0)} + b_1^r m_{(1)} + \dots + b_r^r m_{(r)}. \tag{5}$$
Hence the constants $a_r$ can be computed by combining the various coefficients of $1, x_{(1)}, \dots, x_{(r)}$
of our polynomials with the reduced factorial moments of the observed frequencies and
dividing by $\sum_x \phi(x)\,G_r^2(x)$.

In the Aitken table the $(j+1)$th column will be $b_0^j, b_1^j, \dots, b_j^j$ and the $(i+1)$th row will be
$b_i^i, b_i^{i+1}, \dots, b_i^r$, if constants $a_0, a_1, \dots, a_r$ are required. The $(j+1)$th column therefore merely
consists of the coefficients of $G_j(x)$ written in the form of reduced factorials, the column
starting with the constant.

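Again we have from equation (1) that
$$\frac{y_0}{\phi(0)} = a_0 G_0(0) + a_1 G_1(0) + a_2 G_2(0) + \dots + a_r G_r(0),$$
$$\Delta\frac{y_0}{\phi(0)} = a_1 \Delta G_1(0) + a_2 \Delta G_2(0) + \dots + a_r \Delta G_r(0),$$
$$\Delta^r\frac{y_0}{\phi(0)} = a_r \Delta^r G_r(0),$$
that is,
$$\frac{y_0}{\phi(0)} = a_0 b_0^0 + a_1 b_0^1 + a_2 b_0^2 + \dots + a_r b_0^r,$$
$$\Delta\frac{y_0}{\phi(0)} = a_1 b_1^1 + a_2 b_1^2 + \dots + a_r b_1^r,$$
$$\Delta^r\frac{y_0}{\phi(0)} = a_r b_r^r.$$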






















The whole graduation therefore falls into four easy steps:

(1) Compute the reduced factorial moments by repeated summation.

(2) Combine these according to multipliers in the columns of coefficients of $1, x_{(1)}, \dots, x_{(r)}$
of the orthogonal polynomials $G_r(x)$ and divide by $\sum_x \phi(x)\,G_r^2(x)$, the values set out at the feet
of the columns.

(3) Combine the $a_r$ values obtained by step (2) according to multipliers in the rows of the
same table, thus obtaining $y_0/\phi(0)$, $\Delta[y_0/\phi(0)]$, ..., $\Delta^r[y_0/\phi(0)]$.

(4) Complete the difference table by summation to obtain the other graduated values of
$y_x/\phi(x)$.
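
The four steps can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's own working: the function names are hypothetical, the frequencies are Weldon's dice data as tabulated in Section 4, and the columns of the Aitken table are those quoted there for n = 12, p = 1/3. Exact rational arithmetic is assumed, so later graduated ratios may differ by a unit from a table built up from rounded differences.

```python
from fractions import Fraction as F

# Observed frequencies O_x, x = 0..12 (Weldon's dice data, Section 4).
observed = [185, 1149, 3265, 5475, 6114, 5194, 3067, 1331, 403, 105, 14, 4, 0]

def reduced_factorial_moments(freqs, r_max):
    """Step 1: m_(r) = sum_x O_x * C(x, r), by repeated summation from the bottom."""
    column, moments = list(freqs), []
    for _ in range(r_max + 1):
        for i in range(len(column) - 2, -1, -1):
            column[i] += column[i + 1]      # running sums from the bottom upwards
        moments.append(column[0])           # the top entry is the next moment
        column = column[1:]                 # stop one item from the top next time
    return moments

# Column r of the Aitken table holds G_r(0), dG_r(0), ..., d^r G_r(0);
# norms[r] is sum_x phi(x) G_r(x)^2 (values for n = 12, p = 1/3).
columns = [[F(1)], [F(-4), F(1)], [F(44, 3), F(-22, 3), F(2)],
           [F(-440, 9), F(110, 3), F(-20), F(6)]]
norms = [F(1), F(8, 3), F(352, 27), F(7040, 81)]

def aitken_coefficients(moments, columns, norms):
    """Step 2: a_r = (sum_s b_s^r m_(s)) / sum_x phi(x) G_r(x)^2."""
    return [sum(b * m for b, m in zip(col, moments)) / norm
            for col, norm in zip(columns, norms)]

def terminal_differences(a, columns):
    """Step 3: d^s [y_0 / phi(0)] = sum over r >= s of a_r b_s^r (rows of the table)."""
    size = len(columns)
    return [sum(a[r] * columns[r][s] for r in range(s, size)) for s in range(size)]

def extend_by_summation(diffs, count):
    """Step 4: rebuild y_x / phi(x), x = 0..count-1, from the terminal differences."""
    state, values = list(diffs), []
    for _ in range(count):
        values.append(state[0])
        for s in range(len(state) - 1):
            state[s] += state[s + 1]        # state[s + 1] is still the old value here
    return values

moments = reduced_factorial_moments(observed, 3)
a = aitken_coefficients(moments, columns, norms)
diffs = terminal_differences(a, columns)
ratios = extend_by_summation(diffs, 13)     # y_x / phi(x) for x = 0..12
print(moments, [float(v) for v in a], [float(d) for d in diffs])
```

Multiplying each ratio by the binomial term $\phi(x)$ then gives the graduated frequencies $y_x$.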


3. The orthogonal polynomials of the positive and negative binomials. For the positive
binomial $\binom{n}{x} p^x q^{n-x}$, the orthogonal polynomials are given by
$$G_r(x; n, p) = (1 + p\Delta)^{-n+r-1}\,x^{(r)}, \quad\text{symbolically}$$
$$= x^{(r)} - r(n-r+1)\,p\,x^{(r-1)} + \binom{r}{2}(n-r+2)^{(2)}\,p^2\,x^{(r-2)} + \dots + (-1)^r n^{(r)} p^r, \tag{6}$$
where $x^{(r)} = x(x-1)\dots(x-r+1)$ (Aitken & Gonin, 1935).
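
As an illustration of (6), the following sketch builds $G_r(x; n, p)$ from the explicit expansion and checks, in exact rational arithmetic, that the polynomials are orthogonal under the binomial weights. The helper names (`falling`, `G`, `phi`, `inner`) are hypothetical, and the general term $(-1)^j \binom{r}{j}(n-r+j)^{(j)} p^j x^{(r-j)}$ is the one implied by the symbolic form; the norm compared against is $r!\,n^{(r)} p^r q^r$.

```python
from fractions import Fraction
from math import comb, factorial

def falling(a, k):
    """Factorial power a^(k) = a (a - 1) ... (a - k + 1)."""
    out = 1
    for i in range(k):
        out *= a - i
    return out

def G(r, x, n, p):
    """G_r(x; n, p) from the expansion (6), term by term."""
    return sum((-1) ** j * comb(r, j) * falling(n - r + j, j) * p ** j * falling(x, r - j)
               for j in range(r + 1))

def phi(x, n, p):
    """Positive binomial frequency function."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def inner(r, s, n, p):
    """Weighted inner product sum_x phi(x) G_r(x) G_s(x)."""
    return sum(phi(x, n, p) * G(r, x, n, p) * G(s, x, n, p) for x in range(n + 1))

n, p = 12, Fraction(1, 3)
gram = {(r, s): inner(r, s, n, p) for r in range(4) for s in range(4)}
print([gram[(r, r)] for r in range(4)])
```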

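The explicit form for the polynomials of the negative binomial
$$\phi(x) = \frac{(a)_x}{x!}\left(\frac{\lambda}{a}\right)^x\left(1 + \frac{\lambda}{a}\right)^{-a-x}, \quad\text{where } (a)_x = a(a+1)\dots(a+x-1) = (a+x-1)^{(x)},$$
was first given by Shenton (1950) and the symbolic form by the same author in 1958. The
polynomials were also fully studied by Wiid (1958). They are, in Shenton’s notation,
given by
$$G_r(x; \lambda, a) = \left(1 - \frac{\lambda}{a}\Delta\right)^{a+r-1} x^{(r)} \quad (\lambda \text{ and } a > 0)$$
$$= x^{(r)} - r(a+r-1)\frac{\lambda}{a}\,x^{(r-1)} + \binom{r}{2}(a+r-1)^{(2)}\frac{\lambda^2}{a^2}\,x^{(r-2)} + \dots + (-1)^r (a+r-1)^{(r)}\frac{\lambda^r}{a^r}. \tag{7}$$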


All the relationships between the frequency functions, polynomials, factorial moments,
generating functions, etc., for the negative binomial can be derived from the positive
binomial by putting $p = -\lambda/a$ and $n = -a$, e.g.

                                                          Positive binomial                Negative binomial

(a) Factorial moments $\sum_x \phi(x)\,x^{(r)}$                   $n^{(r)} p^r$                        $(a)_r\,\lambda^r/a^r$

(b) $\sum_x \phi(x)\,G_r^2(x)$                                    $r!\,n^{(r)} p^r q^r$                $r!\,(a)_r\,(\lambda^r/a^r)\,(1 + \lambda/a)^r$

(c) Generating function $\sum_r G_r(x)\,t^r/r!$            $(1+qt)^x (1-pt)^{n-x}$          $(1 + t + \lambda t/a)^x\,(1 + \lambda t/a)^{-a-x}$


4. Fitting by the polynomials of the positive binomial. (a) A fitting by the symmetrical
polynomials where $p = q = \tfrac{1}{2}$ was done by Greenleaf (1932) and Gonin (1945), the latter
using Aitken’s method which proved to be much quicker than Greenleaf’s, especially when
the polynomials were expressed in terms of central and mean central factorials. Now if
we change the polynomials given by (6) in such a way that the constant term is unity, and
denote these polynomials by $G'_r(x; n, p)$, we can easily prove that
$$\binom{n}{x} G'_r(x; n, \tfrac{1}{2}) = H_r^n(x), \quad\text{where } H_r^n(x) = \Delta^r \binom{n-r}{x-r},$$







the elegant orthogonal functions introduced by Milne and Stone (see Milne, 1949) and used
by them for graduating data. The function $H_r^n(x)$ has this peculiar orthogonal property
(in Milne’s notation)
$$\sum_k H_k^n(s)\,H_t^n(k) = 0 \ \text{ if } s \ne t, \qquad = 2^n \ \text{ if } s = t,$$
and the recurrence relationship
$$H_{k+1}^n(s+1) = H_k^n(s+1) - H_{k+1}^n(s) - H_k^n(s).$$
A table of $H_r^n(x)$ is easily computed, the first row in the table being binomial coefficients and
the first column being all unity. By using the recurrence formula the table is quickly
completed and can be used following Fisher’s method, since if we graduate by
$$y_x = \{a_0 G'_0(x) + a_1 G'_1(x) + \dots + a_r G'_r(x)\}\binom{n}{x}\left(\tfrac{1}{2}\right)^n$$
this is equivalent to graduating by
$$y_x = \frac{1}{2^n}\{a_0 H_0^n(x) + a_1 H_1^n(x) + \dots + a_r H_r^n(x)\}.$$
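
A table of this kind can be sketched as follows. The code is an illustration under stated assumptions rather than Milne's own layout: row r, column x of the hypothetical `milne_stone_table` holds $H_r^n(x)$, the first row is taken to be the binomial coefficients, the first column is unity, and the remaining entries follow from the recurrence read as table[r+1][x+1] = table[r][x+1] - table[r+1][x] - table[r][x].

```python
from math import comb

def milne_stone_table(n):
    """Rows r = 0..n, columns x = 0..n of H_r^n(x), built by the recurrence."""
    table = [[comb(n, x) for x in range(n + 1)]]    # first row: binomial coefficients
    for _ in range(n):
        prev, row = table[-1], [1]                  # first column: all unity
        for x in range(n):
            row.append(prev[x + 1] - row[x] - prev[x])
        table.append(row)
    return table

def has_orthogonal_property(table):
    """Check sum_k H_k(s) H_t(k) = 2^n if s == t and 0 otherwise."""
    n = len(table) - 1
    return all(
        sum(table[k][s] * table[t][k] for k in range(n + 1)) == (2 ** n if s == t else 0)
        for s in range(n + 1) for t in range(n + 1)
    )

table = milne_stone_table(4)
for row in table:
    print(row)
```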


(b) If $p \ne q$, it is best to graduate by Aitken’s method. Let us graduate Weldon’s data
in which he records the fives and sixes in 26,306 throws of 12 dice. These data have been
graduated by Fry (1928) by various curves.

We first compute the reduced factorial moments in the usual way by summing from the
bottoms of the respective columns, stopping one item from the top each time, obtaining
$m_{(r)} = \sum_x O_x x_{(r)}$. We obtain
$$m_{(0)} = 26{,}306, \quad m_{(1)} = 106{,}602, \quad m_{(2)} = 198{,}184, \quad m_{(3)} = 223{,}524.$$
The values are entered at the side of Aitken’s table. The first three polynomials for $n = 12$
and $p = \tfrac{1}{3}$ are $x_{(1)} - 4$, $2x_{(2)} - \tfrac{22}{3}x_{(1)} + \tfrac{44}{3}$ and $6x_{(3)} - 20x_{(2)} + \tfrac{110}{3}x_{(1)} - \tfrac{440}{9}$.
The values of $\sum_x \phi(x)\,G_r^2(x)$ are entered at the feet of the columns.











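r                                             0         1          2          3
a_r                                        26,306    516.75    33.8523     1.5290

m_(0) =  26,306     G_r(0)                    1        -4        44/3      -440/9
m_(1) = 106,602     ΔG_r(0)                   .         1       -22/3       110/3
m_(2) = 198,184     Δ²G_r(0)                  .         .          2         -20
m_(3) = 223,524     Δ³G_r(0)                  .         .          .           6

Σφ(x) G_r²(x)                                 1        8/3      352/27     7040/81

$$a_0 = 26{,}306,$$
$$a_1 = (106{,}602 - 4 \times 26{,}306) \div \tfrac{8}{3} = 516.75,$$
$$a_2 = \{198{,}184(2) + 106{,}602(-\tfrac{22}{3}) + 26{,}306(\tfrac{44}{3})\} \div \tfrac{352}{27} = 33.8523,$$
$$a_3 = \{223{,}524(6) + 198{,}184(-20) + 106{,}602(\tfrac{110}{3}) + 26{,}306(-\tfrac{440}{9})\} \div \tfrac{7040}{81} = 1.5290.$$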













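Therefore
$$\frac{y_0}{\phi(0)} = a_0 + a_1 G_1(0) + a_2 G_2(0) + a_3 G_3(0) = 26{,}306 + 516.75(-4) + 33.8523(\tfrac{44}{3}) + 1.5290(-\tfrac{440}{9}) = 24{,}660.75,$$
$$\Delta\frac{y_0}{\phi(0)} = 516.75 - 33.8523(\tfrac{22}{3}) + 1.5290(\tfrac{110}{3}) = 324.56,$$
$$\Delta^2\frac{y_0}{\phi(0)} = 37.12,$$
$$\Delta^3\frac{y_0}{\phi(0)} = 1.5290(6) = 9.17.$$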





A good check on the polynomials is to enter the theoretical reduced factorial moments
$n^{(r)} p^r / r!$ at the side of the table instead of the $m_{(r)}$; then all the $a_r$ values must be zero
except $a_0$.

By completing the difference table we obtain $y_x/\phi(x)$ and multiplying by $\phi(x)$ we obtain $y_x$.


                                       Empirical       Observed      Graduation
 x     y_x/φ(x)    y_x (graduated)     graduation      data          by φ(x)
                                       by Fry          O_x           alone
 0      24,661          190               190             185           203
 1      24,985        1,155             1,185           1,149         1,217
 2      25,347        3,224             3,223           3,265         3,345
 3      25,755        5,461             5,459           5,475         5,576
 4      26,218        6,252             6,252           6,114         6,273
 5      26,746        5,100             5,102           5,194         5,018
 6      27,348        3,041             3,043           3,067         2,927
 7      28,033        1,336             1,337           1,331         1,255
 8      28,810          430               429             403           392
 9      29,688           99                98             105            87
10      30,677           16                15              14            13
11      31,786          1.6               1.4               4             1
12      33,023          0.1               0.1               0             0


There is a very close agreement with the empirical graduation of Fry, in which he
uses the formula
$$\phi(x) = \binom{12.135}{x}(0.339325)^x\,(0.660675)^{12.135 - x}.$$
There is an objection, mentioned by Fry, to the empirical fitting since we know that n
is 12 and not 12.135. His fitting is, however, better than ours as he has one more degree of
freedom in the $\chi^2$ test. Our fitting is, nevertheless, very satisfactory and a great improvement
on the graduation by $\phi(x)$ alone.


5. Graduation by polynomials of the negative binomial. We graduate by 
Ye = [a + ay G(x) +t a, G,(x)] P(x), 


where $\phi(x)$ is the negative binomial frequency function already defined.
The $a_r$ values are obtained from the relationship
$$\sum_x G_r(x)\,O_x = a_r \sum_x \phi(x)\,G_r^2(x),$$
















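where
$$\sum_x \phi(x)\,G_r^2(x) = r!\,(a)_r\,\frac{\lambda^r}{a^r}\left(1 + \frac{\lambda}{a}\right)^r$$
and
$$\lambda = \frac{m_{(1)}}{N} = \bar{x}, \qquad a = \frac{\bar{x}^2}{s^2 - \bar{x}},$$
using moment estimators, where $\bar{x}$ is the mean of the distribution and $s^2$ is the second
moment with zero origin.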


We choose an example based on the data given by Hald (1952, p. 730) and graduated by
him by means of a negative binomial. The reduced factorial moments are first calculated.

Distribution of 647 women according to number of accidents.
(The top value in each summation column is a reduced factorial moment.)

x     O_x    1st sum   2nd sum   3rd sum   4th sum
0     447      647
1     132      200       301
2      42       68       101       143
3      21       26        33        42        53
4       3        5         7         9        11
5       2        2         2         2         2
6       0        0         0         0         0

$\lambda = \bar{x} = \tfrac{301}{647} = 0.4652$, $s^2 = 0.6919$, agreeing with Hald.
Therefore $a = 0.9546$ and $\lambda/a = 0.4873$.
Since the parameters are estimated in this way, it follows that $a_1$ and $a_2$ will both be zero
and we need only compute $G_3(x)$ and $\sum_x \phi(x)\,G_3^2(x)$.
$$G_3(x) = 6x_{(3)} - 6(2.9546)(0.4873)\,x_{(2)} + 3(2.9546)^{(2)}(0.4873)^2\,x_{(1)} - (2.9546)^{(3)}(0.4873)^3$$
$$= 6x_{(3)} - 8.6387\,x_{(2)} + 4.1140\,x_{(1)} - 0.6378.$$
$$\sum_x \phi(x)\,G_3^2(x) = 3!\,(2.9546)^{(3)}(0.4873)^3(1.4873)^3 = 12.5902.$$


Hence our table becomes

r                                       0       1       2        3
a_r                                    647      .       .     -7.2816

m_(0) = 647                             1       .       .     -0.6378
m_(1) = 301                             .       .       .      4.1140
m_(2) = 143                             .       .       .     -8.6387
m_(3) =  53                             .       .       .      6

Σφ(x) G_r²(x)                           1       .       .     12.5902

The differences are
$$\frac{y_0}{\phi(0)} = 652, \qquad \Delta\frac{y_0}{\phi(0)} = -30, \qquad \Delta^2\frac{y_0}{\phi(0)} = 63, \qquad \Delta^3\frac{y_0}{\phi(0)} = -44.$$
By building up a difference table we obtain the values $y_x/\phi(x)$. Or we can use
$$\frac{y_x}{\phi(x)} = 652 - 30x + \frac{x(x-1)}{2!}\,63 - \frac{x(x-1)(x-2)}{3!}\,44.$$
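
The arithmetic of this example can be retraced with the short sketch below. It is hedged in two respects: the variable names are hypothetical, and $s^2$ is assumed to be the sample variance with divisor N - 1, since that is what reproduces the quoted 0.6919. Because the text rounds $\bar{x}$, $s^2$ and $a$ before continuing, the full-precision values computed here agree with the printed ones only to about three significant figures.

```python
from math import comb

observed = [447, 132, 42, 21, 3, 2, 0]            # O_x for x = 0..6 (Hald's data)

def falling(a, k):
    """Factorial power a^(k) = a (a - 1) ... (a - k + 1); a may be fractional."""
    out = 1.0
    for i in range(k):
        out *= a - i
    return out

N = sum(observed)
mean = sum(x * o for x, o in enumerate(observed)) / N
# Assumption: s^2 is the sample variance with divisor N - 1.
s2 = sum(o * (x - mean) ** 2 for x, o in enumerate(observed)) / (N - 1)

lam = mean                                        # moment estimate of lambda
a = mean ** 2 / (s2 - mean)                       # moment estimate of a
rho = lam / a                                     # lambda / a

# Reduced-factorial coefficients of G_3 from (7): constant, x_(1), x_(2), x_(3).
top = a + 2                                       # a + r - 1 with r = 3
coeffs = [-falling(top, 3) * rho ** 3, 3 * falling(top, 2) * rho ** 2, -6 * top * rho, 6.0]
norm3 = 6 * falling(top, 3) * rho ** 3 * (1 + rho) ** 3

moments = [sum(o * comb(x, r) for x, o in enumerate(observed)) for r in range(4)]
a3 = sum(c * m for c, m in zip(coeffs, moments)) / norm3
y0_ratio = N + a3 * coeffs[0]                     # y_0 / phi(0) = a_0 + a_3 G_3(0)
print(lam, s2, a, rho, coeffs, norm3, a3, y0_ratio)
```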































The graduation results are

                                                              Hald
         O_x                      y_x                         (negative
  x      (observed)   y_x/φ(x)    (graduated)     χ²          binomial)      χ²
  0        447          652         446.2         0.0           442.9        0.4
  1        132          622         133.1         0.0           138.5        0.3
  2         42          655          44.9         0.2            44.4        0.1
  3         21          707          15.6         1.9            14.3        3.1
  4          3          734           5.2         0.9             4.6        0.6
  5          2          692           1.6         0.1             1.5        0.2
 ≥6          0          537           0.6         0.6             0.7        0.7
Totals     647           .          647.2         2.7           646.9        5.4
                                             D.F. = 3                   D.F. = 4
                                             P = 0.45                   P = 0.25


The graduation is obviously excellent and an improvement on Hald’s, even allowing for 
the fact that there is one degree of freedom less. The parameters were also calculated by the 
method of maximum likelihood, using the method of Haldane (1941), and the corresponding 
polynomials constructed. The results showed little improvement on that of our method 
and we lost the advantage of having a, and a, zero, so that all the polynomials had to be 
constructed. 

We examined about five other distributions which were graduated by the negative
binomial. In every case the use of the third polynomial brings about a good improvement
in the graduation. In the positive binomial case the integer value of $n$ is usually known and
this should be used and not that obtained from the method of moment estimators.

Shenton has another method of estimating the parameters $\lambda$ and $a$ by using orthogonal
polynomials up to the third degree for $\partial\phi/\partial\lambda$ and $\partial\phi/\partial a$. This involves much arithmetic
and our method of using the third polynomials with moment estimators has more or less
the same effect.

There is another objection which is often brought forward against a graduation by a
series of polynomials. It is said that the meaning of the parameters, such as they occur, e.g., in
ecology, is disturbed. The question arises whether such a graduating series is entirely
empirical or whether there is a justification for its employment. In the case of the positive and
negative binomial there is such a justification. Consider the frequency function $\binom{n}{x} p^x q^{n-x}$.
Let $p'$ and $q'$ be the estimated parameters and $p$ and $q$ the true parameters. Then if $p'$ and
$q'$ deviate slightly from the true parameters, we have $p = p' + \epsilon$ and $q = q' - \epsilon$, where $\epsilon$ is
a small quantity such as 0.005, which may, for example, be caused by wear and tear of the
dice. Hence
$$\binom{n}{x} p^x q^{n-x} = \binom{n}{x}(p' + \epsilon)^x (q' - \epsilon)^{n-x}$$
$$= \binom{n}{x} p'^x q'^{n-x}\left(1 + \frac{\epsilon}{p'}\right)^x\left(1 - \frac{\epsilon}{q'}\right)^{n-x}$$
$$= \binom{n}{x} p'^x q'^{n-x}\left(1 + q'\frac{\epsilon}{p'q'}\right)^x\left(1 - p'\frac{\epsilon}{p'q'}\right)^{n-x}.$$
But
$$(1 + q't)^x (1 - p't)^{n-x} = \sum_{r=0}^{n} G_r(x, p')\,\frac{t^r}{r!}$$








(see Szegő, 1939), where $G_r(x, p')$ is the rth orthogonal polynomial of the weighting function
$\binom{n}{x} p'^x q'^{n-x}$. Hence
$$\binom{n}{x} p^x q^{n-x} = \binom{n}{x} p'^x q'^{n-x}\left\{1 + \frac{\epsilon}{p'q'}\,G_1(x, p') + \frac{\epsilon^2}{2!\,p'^2 q'^2}\,G_2(x, p') + \dots + \frac{\epsilon^n}{n!\,p'^n q'^n}\,G_n(x, p')\right\}.$$
If $\epsilon$ is small and $p$ and $q$ are finite values of the order 1, then the coefficients $\epsilon^r/(r!\,p'^r q'^r)$ are
small and we may find that the series converges quickly and that the use of polynomials
up to the 3rd degree already gives a very good approximation.
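
The size of the neglected terms can be examined numerically. The sketch below is illustrative only: the function names are hypothetical, and $G_r(x, p')/r!$ is taken directly as the coefficient of $t^r$ in the generating function $(1 + q't)^x (1 - p't)^{n-x}$, so that the full series reproduces the perturbed binomial exactly while the series cut off at the third polynomial can be compared with it. The values n = 12, p' = 1/3 and ε = 0.005 are chosen to match the dice example.

```python
from fractions import Fraction
from math import comb

def g_over_factorial(r, x, n, p):
    """Coefficient of t^r in (1 + q t)^x (1 - p t)^(n - x), i.e. G_r(x, p) / r!."""
    q = 1 - p
    return sum(comb(x, j) * q ** j * comb(n - x, r - j) * (-p) ** (r - j)
               for j in range(r + 1))

def binomial_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def series_pmf(x, n, p_est, eps, degree):
    """Binomial term at p_est times the series in G_r, truncated at the given degree."""
    t = eps / (p_est * (1 - p_est))
    series = sum(g_over_factorial(r, x, n, p_est) * t ** r for r in range(degree + 1))
    return binomial_pmf(x, n, p_est) * series

n, p_est, eps = 12, Fraction(1, 3), Fraction(1, 200)
true_pmf = [binomial_pmf(x, n, p_est + eps) for x in range(n + 1)]
cubic = [series_pmf(x, n, p_est, eps, 3) for x in range(n + 1)]
worst = max(abs(c - t) / t for c, t in zip(cubic, true_pmf))
print(float(worst))   # largest relative error of the third-degree series
```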


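6. Construction of three polynomials in discrete cases where explicit forms for the polynomials
are not available. The author has found it convenient to express the polynomials
in factorials of x instead of powers of x. If $\phi(x)$ is the frequency function, the rth polynomial
is given by
$$G_r(x) = \begin{vmatrix} 1 & x^{(1)} & \dots & x^{(r)} \\ \sum\phi(x) & \sum\phi(x)\,x^{(1)} & \dots & \sum\phi(x)\,x^{(r)} \\ \sum\phi(x)\,x^{(1)} & \sum\phi(x)\,x^{(1)}x^{(1)} & \dots & \sum\phi(x)\,x^{(1)}x^{(r)} \\ \vdots & \vdots & & \vdots \\ \sum\phi(x)\,x^{(r-1)} & \sum\phi(x)\,x^{(r-1)}x^{(1)} & \dots & \sum\phi(x)\,x^{(r-1)}x^{(r)} \end{vmatrix},$$
where $P^{(r)} = |p_{st}|$ $(s = 0, 1, \dots, r;\ t = 0, 1, \dots, r)$
and
$$p_{rs} = \sum_x \phi(x)\,x^{(r)}x^{(s)}$$
$$= \sum_x \phi(x)\,x^{(r)}(x - r + r)^{(s)}$$
$$= \sum_x \phi(x)\,x^{(r+s)} + r\,s\sum_x \phi(x)\,x^{(r+s-1)} + \binom{r}{2}s^{(2)}\sum_x \phi(x)\,x^{(r+s-2)} + \dots$$
(using Vandermonde’s expansion)
$$= \mu_{(r+s)} + r\,s\,\mu_{(r+s-1)} + \binom{r}{2}s^{(2)}\mu_{(r+s-2)} + \dots$$
This expansion of moments of the type $\sum_x \phi(x)\,x^{(r)}x^{(s)}$ into a series involving factorial moments
has been given by Guest (1953).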
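
The Vandermonde expansion is easily checked numerically. In the sketch below (hypothetical helper names; the positive binomial with n = 12, p = 1/3 is assumed as the frequency function, so that $\mu_{(k)} = n^{(k)} p^k$) the product moments $p_{rs}$ obtained from factorial moments are compared with the direct sums. The general term is written as $\binom{s}{j} r^{(j)} \mu_{(r+s-j)}$, which coincides with the terms displayed above for j = 0, 1, 2.

```python
from fractions import Fraction
from math import comb

def falling(a, k):
    """Factorial power a^(k) = a (a - 1) ... (a - k + 1)."""
    out = 1
    for i in range(k):
        out *= a - i
    return out

def product_moment(r, s, mu):
    """p_rs = sum_j C(s, j) r^(j) mu_(r+s-j), with mu(k) the kth factorial moment."""
    return sum(comb(s, j) * falling(r, j) * mu(r + s - j) for j in range(min(r, s) + 1))

n, p = 12, Fraction(1, 3)

def mu(k):
    """Factorial moment of the positive binomial, n^(k) p^k."""
    return falling(n, k) * p ** k

def direct(r, s):
    """p_rs summed directly over the binomial frequency function."""
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) * falling(x, r) * falling(x, s)
               for x in range(n + 1))

print(product_moment(1, 1, mu), direct(1, 1))
```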


I should like to thank Prof. A. J. B. Wiid, lately of the Council of Science and Industrial 
Research, for his valuable assistance in drawing up this paper, and Mrs H. E. Rudolf, also 
of the C.S.I.R., for her technical assistance. Further, I wish particularly to thank the 
referees who, through their helpful remarks, have suggested an appropriate introduction 
to the paper and also other improvements. 


REFERENCES 


Aitken, A. C. (1932-33). On the graduation of data by the orthogonal polynomials of least squares.
Proc. Roy. Soc. Edinb. 53, 54-78.

Aitken, A. C. & Gonin, H. T. (1935). On fourfold sampling with and without replacement. Proc.
Roy. Soc. Edinb. 55, 114-25.

Fry, T. C. (1928). Probability and its Engineering Uses. New York: D. van Nostrand Co. Inc.
















Guest, P. (1953). The Doolittle method and the fitting of polynomials to weighted data. Biometrika,
40, 229-31.

Gonin, H. T. (1945). Curve fitting by means of the orthogonal polynomials in binomial statistical
distributions. Trans. Roy. Soc. S. Afr. 30, 207-15.

Greenleaf, H. E. H. (1932). Curve approximation by means of functions analogous to the Hermite
polynomials. Ann. Math. Statist. 3, 204-55.

Hald, A. (1952). Statistical Theory with Engineering Applications. New York: John Wiley and
Sons, Inc.

Haldane, J. B. S. (1941). The fitting of binomial distributions. Ann. Eugen., Lond., 11, 179-81.

Milne, W. E. (1949). Numerical Analysis. Princeton University Press.

Shenton, L. R. (1950). Maximum likelihood and the efficiency of the method of moments. Biometrika,
37, 111-16.

Shenton, L. R. (1958). Moment estimators and maximum likelihood. Biometrika, 45, 411-20.

Szegő, G. (1939). Orthogonal Polynomials. New York: Mathematical Society Colloquium Publications,
23.

Wiid, A. J. B. (1958). Afleiding van die ortogonale polinome van die negatiewe binomiale en negatiewe
faktoriaalbinomiale verdelings. Tydskr. Wet. Kuns. Bloemfontein. Nuwe Reeks Deel 18.

























Biometrika (1961), 48, 1 and 2, p. 125 125 
Printed in Great Britain 


The estimation of regression and error-scale parameters, when 
the joint distribution of the errors is of any continuous 
form and known apart from a scale parameter 


By A. M. W. VERHAGEN 


Division of Mathematical Statistics, Commonwealth Scientific and Industrial 
Research Organization, Melbourne, Australia 


INTRODUCTION AND SUMMARY 


Pitman (1938) discussed the estimation of location and scale parameters of a continuous 
population of any given form, when n independent observations are available. 

In this paper, Pitman’s estimation theory and his notions, ‘proper region’, ‘fiducial 
function’ and ‘the estimator property’ are extended to include the situation where all 
regression parameters and an error-scale parameter in multiple linear regression models 
are to be estimated. A further generalization is that the errors need not have independent 
or identical distributions. It is shown that the joint-fiducial functions may be used to give 
joint and individual confidence regions for the parameters. ‘Closest’ and ‘least-mean- 
square-error’ estimators for the parameters are expressed in terms of the fiducial function 
and the general theory developed is illustrated with the case of normal independently 
distributed errors. 


PROBABILITY RELATIONS BETWEEN OBSERVATIONS AND PARAMETERS 


Fiducial functions 
Consider the regression model
$$y = X\beta + \sigma u,$$
where $X$ is the $n \times r$ regression matrix of rank $r$, $\beta$ is the vector $(\beta_1, \dots, \beta_r)$ of unknown
regression coefficients, and $\sigma$ is the unknown positive error-scale parameter. The co-ordinates
$u_1, \dots, u_n$ of the vector $u$ are assumed to have a completely known joint distribution $F(u)\,du$.
For any given vector-set $P$ such that
$$\int \dots \int_P F(u)\,du = p$$
the relation
$$\frac{y - X\beta}{\sigma} \in P \tag{1}$$
between the random variables $y_1, \dots, y_n$ and the unknown parameters $\beta_1, \beta_2, \dots, \beta_r$ and $\sigma$
is satisfied with probability $p$ irrespective of the true value of the parameters.
The relation (1) will be studied with the aid of subsets $A(y)$ defined as†
$$A(y) = \left\{\frac{y - Xb}{s} : 0 < s < \infty,\ -\infty < b_v < \infty,\ v = 1, \dots, r\right\}, \tag{2}$$
where $s$ takes all values in the range $(0, \infty)$ and the $b_v$ $(v = 1, \dots, r)$ take all values in the
range $(-\infty, \infty)$. Each vector $y$ has such a set $A(y)$ associated with it, but vectors $y$ in $A(y)$
all lead to the same set $A(y)$ ($A$ is mnemonic for association). The sets $A(y)$ define a
subdivision of the space into mutually exclusive sets, and this subdivision defines a subdivision
of $P$ into mutually exclusive sets of the form $P \cap A(y)$.

† The brace notation {:} indicates the set of vectors before the double point, satisfying the
conditions behind it.

The (unconditional) probability that (1) is satisfied may therefore be expressed in terms
of conditional probabilities given the spaces $A(y)$. To bring out the relationship between
(1) and the conditional probabilities given $A(y)$ explicitly, a change of variables is needed
such that some of the new variables describe the vectors in $A(y)$ and the others are constant
within $A(y)$. For this purpose the definition of $A(y)$ suggests writing $y$ as a sum of two vectors,
one in the space spanned by the columns of the matrix $X$ and one in the space spanned by
the columns of a matrix $\bar{X}$. (The square matrix $[X : \bar{X}]$ has rank $n$, $X$ has rank $r$ and $\bar{X}$ has rank $n - r$.)























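This gives
$$y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = [X : 0]\begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix} + [0 : \bar{X}]\begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix} = [X : \bar{X}]\,z, \tag{3}$$
where $z_1, \dots, z_r$ and $z_{r+1}, \dots, z_n$ are co-ordinates of $y$ with respect to the columns of $X$ and $\bar{X}$.
Hence
$$\frac{y - X\beta}{\sigma} = [X : \bar{X}]\left(\frac{z_1 - \beta_1}{\sigma}, \dots, \frac{z_r - \beta_r}{\sigma}, \frac{z_{r+1}}{\sigma}, \dots, \frac{z_n}{\sigma}\right)'.$$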







The change of variables from $y$ to $z$ has Jacobian $|X : \bar{X}| \ne 0$. Let $y^0$ stand for a definite
vector $y$ and $z^0$ its transform according to (3). Then the change of variables from $z_1, \dots, z_n$
to $s, b_1, \dots, b_r, w_1, \dots, w_{n-r-1}$ defined by
$$\frac{z_1 - \beta_1}{\sigma} = \frac{z_1^0 - b_1}{s},$$
$$\vdots$$
$$\frac{z_r - \beta_r}{\sigma} = \frac{z_r^0 - b_r}{s},$$
$$\frac{z_{r+1}}{\sigma} = \frac{R}{s}\cos w_1,$$
$$\vdots \tag{4}$$
$$\frac{z_{n-1}}{\sigma} = \frac{R}{s}\sin w_1 \sin w_2 \dots \cos w_{n-r-1},$$
$$\frac{z_n}{\sigma} = \frac{R}{s}\sin w_1 \sin w_2 \dots \sin w_{n-r-1}$$









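is such that the new variables $b_1, \dots, b_r$ and $s$ describe variations within $A(y^0)$ and
$w_1, \dots, w_{n-r-1}$ are constant inside $A(y^0)$. The required transformation has been achieved
in two steps and the Jacobian is $\psi(w)\,(1/s)^{n+1}$ ($w$ stands for $w_1, \dots, w_{n-r-1}$). For $w$ constant,
i.e. $y \in A(y^0)$, the change of variable (4) may be written
$$\frac{z - \beta}{\sigma} = \frac{z^0 - b}{s}, \tag{5}$$
which with the aid of (3) becomes
$$\frac{y - X\beta}{\sigma} = \frac{y^0 - Xb}{s}. \tag{6}$$
Hence the integral
$$\int \dots \int_{(y - X\beta)/\sigma \,\in\, P} F\!\left(\frac{y - X\beta}{\sigma}\right) d\!\left(\frac{y - X\beta}{\sigma}\right) \tag{7}$$
transforms to
$$\int \dots \int \frac{\psi(w)}{K(w)}\left[\int \dots \int_{(y^0 - Xb)/s \,\in\, P \cap A(y^0)} \frac{K(w)}{s^{n+1}}\,F\!\left(\frac{y^0 - Xb}{s}\right) db_1 \dots db_r\,ds\right] dw_1 \dots dw_{n-r-1}, \tag{8}$$
where $K(w)$ stands for
$$\left[\int \dots \int \frac{1}{s^{n+1}}\,F\!\left(\frac{y^0 - Xb}{s}\right) db_1, \dots, db_r\,ds\right]^{-1}. \tag{9}$$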


Formula (8) gives the probability that (1) is satisfied, and the inner integral in (8) gives
the conditional probability that (1) is satisfied given $y \in A(y^0)$.

Probability relations between parameters and observations for the case of n independent
observations from a population of the form $f((y - \beta_1)/\sigma)$ in which $\beta_1$ is an unknown location
parameter and $\sigma$ an unknown scale parameter were studied by Pitman (1938). It becomes
a special case of the theory considered here by taking $r = 1$ and
$$X = (1, 1, \dots, 1)', \qquad F(u) = f(u_1)\,f(u_2) \dots f(u_n)$$
(i.e. errors are independent and have identical distributions).

The function
$$\frac{K(w)}{s^{n+1}}\,F\!\left(\frac{y^0 - Xb}{s}\right) \tag{10}$$
in the inner integral of (8) then reduces to
$$\frac{K(w)}{s^{n+1}}\,f\!\left(\frac{y_1^0 - b_1}{s}\right) f\!\left(\frac{y_2^0 - b_1}{s}\right) \dots f\!\left(\frac{y_n^0 - b_1}{s}\right). \tag{11}$$
Pitman (1938) calls (11) the fiducial function for $\beta_1$ and $\sigma$. By analogy (10) may be called
the fiducial function for $\beta_1, \dots, \beta_r$ and $\sigma$.


Proper regions
The integral $I(y^0)$ of the fiducial function corresponding to an observed vector $y^0$ over
a region of the type
$$\frac{y^0 - Xb}{s} \in P \cap A(y^0) \tag{12}$$
gives the conditional probability that
$$\frac{y - X\beta}{\sigma} \in P \cap A(y^0) \tag{12a}$$
given $y \in A(y^0)$.







Thus with integration regions of the form (12) it is possible to evaluate the conditional
probability of relations of the type (12a) being satisfied. Regions of the form (12) are called
‘proper regions’, a term borrowed from Pitman (1938). If $P$ is such that the conditional
probability of (12a) equals $p$ for all possible spaces $A(y)$ then the unconditional probability
is also $p$.

If the fiducial function is integrated over regions other than those of the type (12) but
such that the integral equals $p$, then there is no guarantee that the statement arrived at
by replacing $y^0$ by $y$ and $(b, s)$ by $(\beta, \sigma)$ as in (12) and (12a), is satisfied with probability $p$.
Pitman (1957) discusses the shortcomings of the Behrens-Fisher test in relation to ‘proper
regions’.


SIMULTANEOUS AND INDIVIDUAL CONFIDENCE SETS FOR THE PARAMETERS 


For a given space $A(y^0)$ the integration region defined by (13)
$$s \in S[A(y^0)], \qquad b_v \in B_v[A(y^0)] \quad (v = 1, \dots, r) \tag{13}$$
is proper since it may be written in the form (12) by defining
$$P \cap A(y^0) = \left\{\frac{y^0 - Xb}{s} : s \in S[A(y^0)],\ b_v \in B_v[A(y^0)],\ v = 1, \dots, r\right\}.$$
Hence by integrating the fiducial function corresponding to $y^0$ over the region (13) we
obtain the conditional probability that
$$\frac{y - X\beta}{\sigma} \in P \cap A(y^0), \tag{14}$$
given $y \in A(y^0)$. Since for $y \in A(y^0)$ the vectors $y$ may be written $(y^0 - Xb)/s$ (see formula
(2)), formula (14) reduces to
$$\frac{1}{\sigma}\left(\frac{y^0 - Xb}{s} - X\beta\right) \in \left\{\frac{y^0 - Xb}{s} : s \in S[A(y^0)],\ b_v \in B_v[A(y^0)]\right\} \tag{15}$$
which is equivalent to the set of relations
$$b_v + \beta_v s \in B_v[A(y^0)] \quad (v = 1, \dots, r), \qquad \sigma s \in S[A(y^0)]. \tag{16}$$
The relations (16) define simultaneous confidence sets for the parameters and are satisfied
with conditional probability $p$.

Individual confidence relations for the parameter $\beta_j$ say are obtained by taking $S[A(y^0)]$
as the range $(0, \infty)$ and $B_v[A(y^0)]$ as the range $(-\infty, \infty)$ for all $v \ne j$. The set $B_j[A(y^0)]$ must
be defined to give the integral of the fiducial function over the region
$$b_j \in B_j[A(y^0)]$$
the required value say $p$. Individual confidence sets for $\sigma$ are found in a similar manner.
If for each possible space $A(y)$ confidence relations are constructed which hold with
conditional probability $p$ then the unconditional probability is also $p$.
















THE ESTIMATOR PROPERTY 


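Any two sets of parameter values $(\beta_1, \sigma_1)$ and $(\beta_2, \sigma_2)$ are related by a transformation
of the type
$$\beta_2 = \lambda\beta_1 + \lambda\Delta, \qquad \sigma_2 = \lambda\sigma_1 \tag{17}$$
for some $\lambda > 0$ and $\Delta$. If in
$$y_1 = X\beta_1 + \sigma_1 u$$
the parameter values $(\beta_1, \sigma_1)$ are replaced by $(\beta_2, \sigma_2)$, then the distribution of $y_2$ may be
obtained from that of $y_1$ by the transformation
$$y_2 = \lambda(y_1 + X\Delta). \tag{18}$$
A statistic is said to have the estimator property when it mimics the parameter it is to
estimate, in that, just as the change (17) in parameters leads to the transformation (18),
estimates for the parameters corresponding to $y_1$ and $y_2$ are related by
$$\hat{\beta}_2 = \lambda\hat{\beta}_1 + \lambda\Delta, \qquad \hat{\sigma}_2 = \lambda\hat{\sigma}_1, \tag{19}$$
the analogue of (17).† More explicitly (19) may be written
$$\hat{\beta}_v(\lambda[y + X\Delta]) = \lambda\hat{\beta}_v(y) + \lambda\Delta_v \quad (v = 1, \dots, r),$$
$$\hat{\sigma}(\lambda[y + X\Delta]) = \lambda\hat{\sigma}(y), \tag{20}$$
for all $\lambda > 0$ and all $\Delta$.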

The definition of the estimator property is due to Pitman (1938). It is introduced here
in a slightly different manner and for a more general situation. As in the case of location and
scale parameters the concept defines a class of statistics readily dealt with in terms of
fiducial functions. Commonly used statistics such as the least-squares estimators and the
maximum-likelihood estimators have the estimator property. See examples 1 and 2 below.


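Example 1. The least-squares estimators $\hat{\beta}^{\square}$ and the associated estimator $\hat{\sigma}^{\square}$ have the
estimator property.‡
The least-squares estimators $\hat{\beta}^{\square}$ for the regression parameters $\beta$ are given by
$$\hat{\beta}^{\square} = (X'X)^{-1}X'y.$$
Replacing $y$ by $\lambda(y + X\Delta)$ gives
$$(X'X)^{-1}X'(\lambda y + \lambda X\Delta) = \lambda\hat{\beta}^{\square} + \lambda\Delta$$
as required by (20). Further, supposing all $u_i$ have unit variance, the familiar estimator
$\hat{\sigma}^{\square}$ for the error-scale parameter is
$$(\hat{\sigma}^{\square})^2 = \frac{1}{n - r}\,(y - X\hat{\beta}^{\square})'(y - X\hat{\beta}^{\square}).$$
Replacing $y$ by $\lambda(y + X\Delta)$ gives $\lambda\hat{\sigma}^{\square}$ as required by (20).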
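
The estimator property of Example 1 can be confirmed on a small numerical case. The sketch below is an illustration under simplifying assumptions: a straight-line model with r = 2 (a column of ones and one regressor), made-up observations, hypothetical function and variable names, and exact rational arithmetic so that the relations (20) can be checked as equalities.

```python
from fractions import Fraction

def least_squares(xs, ys):
    """Fit y = b1 + b2 * x; return (b1, b2, sigma_squared) with divisor n - 2."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b2 = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b1 = (sy - b2 * sx) / n
    rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(xs, ys))
    return b1, b2, rss / (n - 2)

# Illustrative data only (not taken from the paper).
xs = [Fraction(v) for v in (0, 1, 2, 3, 4)]
ys = [Fraction(v) for v in (1, 3, 2, 5, 4)]

lam = Fraction(3, 2)                               # lambda > 0
delta = (Fraction(2), Fraction(-1, 2))             # the shift vector (Delta_1, Delta_2)

# y2 = lambda * (y + X Delta), where the rows of X are (1, x_i).
ys2 = [lam * (y + delta[0] + delta[1] * x) for x, y in zip(xs, ys)]

b1, b2, var = least_squares(xs, ys)
c1, c2, var2 = least_squares(xs, ys2)
print(b1, b2, var, c1, c2, var2)
```

Since the estimated variance is multiplied by $\lambda^2$, its square root is multiplied by $\lambda$, as (20) requires.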


† All estimators in this paper will carry the circumflex sign ^.
‡ All least-squares estimators are labelled with a square □ superscript.










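Example 2. The maximum-likelihood estimators for $\beta$ and $\sigma$ have the estimator property.
The maximum-likelihood estimators corresponding to $y$ simultaneously satisfy
$$\frac{\partial}{\partial\beta_v}\,F\!\left(\frac{y - X\beta}{\sigma}\right) = 0 \quad (v = 1, \dots, r), \qquad \frac{\partial}{\partial\sigma}\left[\frac{1}{\sigma^n}\,F\!\left(\frac{y - X\beta}{\sigma}\right)\right] = 0. \tag{21}$$
The maximum-likelihood estimators corresponding to $\lambda y + \lambda X\Delta$ simultaneously satisfy
$$\frac{\partial}{\partial\beta_v}\,F\!\left(\frac{y - X\{(\beta/\lambda) - \Delta\}}{\sigma/\lambda}\right) = 0 \quad (v = 1, \dots, r), \qquad \frac{\partial}{\partial\sigma}\left[\frac{1}{\sigma^n}\,F\!\left(\frac{y - X\{(\beta/\lambda) - \Delta\}}{\sigma/\lambda}\right)\right] = 0. \tag{22}$$
The equations (22) may also be written as
$$\frac{\partial}{\partial\{(\beta_v/\lambda) - \Delta_v\}}\,F\!\left(\frac{y - X\{(\beta/\lambda) - \Delta\}}{\sigma/\lambda}\right) = 0 \quad (v = 1, \dots, r), \qquad \frac{\partial}{\partial(\sigma/\lambda)}\left[\frac{1}{(\sigma/\lambda)^n}\,F\!\left(\frac{y - X\{(\beta/\lambda) - \Delta\}}{\sigma/\lambda}\right)\right] = 0. \tag{23}$$
The solutions of (21) for $\beta_1, \dots, \beta_r$ and $\sigma$ are related to the solutions of (23) by (20) as
required.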


Example 3. Fixed quantile points of the marginal fiducial functions corresponding to
vectors $y \in A(y^0)$ have the estimator property.

Any vector $y \in A(y^0)$ may be written as $\lambda(y^0 + X\Delta)$ for some $\lambda > 0$ and $\Delta$. The fiducial
function corresponding to $y$ is
$$K(w)\,F\!\left(\frac{y^0 - X\{(b/\lambda) - \Delta\}}{s/\lambda}\right)\frac{1}{s^{n+1}}.$$
It follows that the marginal quantile points transform according to (20), i.e. they have
the estimator property.





Remark 1. If two statistics for the same parameter both have the estimator property for
all $y \in A(y^0)$ then their difference does not change sign. This follows from (20) by subtracting
two such estimators.

Remark 2. If for different spaces $A(y)$ a statistic is defined by a different marginal
quantile point it will still have the estimator property. The estimator property only relates
estimators corresponding to different vectors $y$ within the same space $A(y)$.


Closest estimators
If for all estimators $\hat{\theta}$ in a class, the estimator $\hat{\theta}^c$ is such that
$$|\hat{\theta}^c - \theta| < |\hat{\theta} - \theta|$$
with probability $\ge \tfrac{1}{2}$, $\hat{\theta}^c$ will be called a closest estimator for $\theta$ in the class.†

Proposition. The median $\hat{\theta}^c$ of the marginal fiducial function for that parameter is a
closest estimator.

† All closest estimators in this paper will be labelled with a superscript c.
















If $\hat{\theta}^c$ stands for the median of the marginal fiducial function for $\theta$, then the random
interval $(-\infty, \hat{\theta}^c)$ has probability 0.5 of containing the true parameter value $\theta$. Estimators
are either smaller or larger than $\hat{\theta}^c$ for all $y \in A(y^0)$ (Remark 1 in the previous section).
Those which are larger will be more distant from the true parameter value $\theta$ at least when
the interval $(-\infty, \hat{\theta}^c)$ contains $\theta$, i.e. in at least 50 % of cases. The reasoning for estimators
smaller than $\hat{\theta}^c$ is the same when instead the interval $(\hat{\theta}^c, \infty)$ is used.


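Example 4. Closest estimators of the regression and error-scale parameters, when the
errors are normal and independently distributed with zero means and unknown equal
variances.

The joint distribution of $y_1, \dots, y_n$ readily reduces to
$$\left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^n \exp\left\{-\frac{1}{2}\left[\frac{(n-r)(\hat{\sigma}^{\square})^2}{\sigma^2} + \left(\frac{\hat{\beta}^{\square} - \beta}{\sigma}\right)' X'X \left(\frac{\hat{\beta}^{\square} - \beta}{\sigma}\right)\right]\right\} dy_1, \dots, dy_n,$$
where $\hat{\beta}^{\square}$ is the vector of least-squares estimators for $\beta$ and $\hat{\sigma}^{\square}$ is the associated estimator
for $\sigma$ (see example 1 of the preceding section).

The fiducial function for the parameters $\beta_1, \dots, \beta_r, \sigma$ is
$$\frac{K(w)}{s^{n+1}} \exp\left\{-\frac{(n-r)(\hat{\sigma}^{\square})^2}{2s^2} - \frac{(\hat{\beta}^{\square} - b)'X'X(\hat{\beta}^{\square} - b)}{2s^2}\right\}.$$
The closest estimator $\hat{\beta}_v^c$ of $\beta_v$ equals $\hat{\beta}_v^{\square}$, the mean of the symmetrical marginal fiducial
function for $\beta_v$ $(v = 1, \dots, r)$. Integrating out $\beta_1, \dots, \beta_r$ from the fiducial function gives the
marginal fiducial function for $\sigma$ and its median $\hat{\sigma}^c$ is defined by
$$\int_0^{\hat{\sigma}^c} \frac{C}{s^{n-r+1}} \exp\left\{-\frac{(n-r)(\hat{\sigma}^{\square})^2}{2s^2}\right\} ds = \tfrac{1}{2},$$
where $C$ is the normalizing constant. The transformation $\xi = \tfrac{1}{2}(n-r)(\hat{\sigma}^{\square})^2/s^2$ gives
$$\int_{\xi = \frac{1}{2}(n-r)(\hat{\sigma}^{\square})^2/(\hat{\sigma}^c)^2}^{\infty} \frac{\xi^{\frac{1}{2}(n-r)-1}\,e^{-\xi}}{\Gamma\{\tfrac{1}{2}(n-r)\}}\,d\xi = \tfrac{1}{2}.$$
The lower limit of integration is the median of a gamma distribution $\Gamma\{\tfrac{1}{2}(n-r)\}$. Pitman
(1937) has shown that the median of a $\Gamma\{\tfrac{1}{2}(n-r)\}$ distribution is approximately $\{\tfrac{1}{2}(n-r) - \tfrac{1}{3}\}$.
It follows
$$(\hat{\sigma}^c)^2 \simeq \frac{(n-r)\,(\hat{\sigma}^{\square})^2}{n - r - \tfrac{2}{3}}.$$
The closest estimator $\hat{\sigma}^c$ is slightly larger than $\hat{\sigma}^{\square}$.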
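
The quality of the approximation to the gamma median may be gauged from a short computation. The sketch below makes simplifying assumptions that are not in the text: n - r is taken even so that the gamma shape is an integer and the distribution function has a finite closed form, the median is found by bisection, and all function names are hypothetical.

```python
from math import exp, sqrt

def gamma_cdf_integer_shape(x, k):
    """P(X <= x) for a gamma variate with integer shape k and unit scale."""
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= x / j                    # x^j / j!
        total += term
    return 1.0 - exp(-x) * total

def gamma_median(k, tol=1e-12):
    """Median of the gamma distribution with integer shape k, by bisection."""
    lo, hi = 0.0, 10.0 * k
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gamma_cdf_integer_shape(mid, k) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

degrees = 10                             # n - r, assumed even for this sketch
k = degrees // 2                         # gamma shape, (n - r) / 2
median = gamma_median(k)
exact_ratio = sqrt(k / median)                       # closest estimate / least-squares estimate
approx_ratio = sqrt(degrees / (degrees - 2 / 3))     # the same ratio from the approximation
print(median, k - 1 / 3, exact_ratio, approx_ratio)
```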





Least-mean-square-error estimators

The estimator $\hat{\beta}_v^m$ which makes the expected value $E(\hat{\beta}_v^m - \beta_v)^2$ of $(\hat{\beta}_v^m - \beta_v)^2$ a minimum
will be called the least-mean-square-error estimator.† For $y \in A(y^0)$ the transformation
used in deriving the fiducial function (10) may be written as (6). With the aid of (6) and the
estimator property (20), $(\hat{\beta}_v(y) - \beta_v)^2$ reduces to
$$\frac{\sigma^2}{s^2}\,(\hat{\beta}_v(y^0) - b_v)^2.$$

† Least-mean-square-error estimators will be labelled with a superscript m.





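Hence $E(\hat{\beta}_v - \beta_v)^2$ may be written as
$$\sigma^2 \int \dots \int \frac{\psi(w)}{K(w)}\left[\int \dots \int \left(\frac{\hat{\beta}_v - b_v}{s}\right)^2 \frac{K(w)}{s^{n+1}}\,F\!\left(\frac{y^0 - Xb}{s}\right) db_1 \dots db_r\,ds\right] dw_1 \dots dw_{n-r-1}.$$
This integral will be a minimum if the part in the square brackets, i.e.
$$E_f\left(\frac{\hat{\beta}_v - b_v}{s}\right)^2,$$
is a minimum. ($f$ is mnemonic for fiducial.) Thus $\hat{\beta}_v^m$ is the solution of
$$\frac{\partial}{\partial\hat{\beta}_v}\,E_f\left(\frac{\hat{\beta}_v - b_v}{s}\right)^2 = 0;$$
this gives
$$\hat{\beta}_v^m = \frac{E_f(b_v/s^2)}{E_f(1/s^2)}.$$
Similarly the least-mean-square-error estimator $\hat{\sigma}^m$ of $\sigma$ is the one which makes
$$E\left(\frac{\hat{\sigma}^m - \sigma}{\sigma}\right)^2$$
a minimum. As before it follows $E_f\{(\hat{\sigma}^m - s)/s\}^2$ must be a minimum. By putting
$$\frac{\partial}{\partial\hat{\sigma}^m}\,E_f\left(\frac{\hat{\sigma}^m - s}{s}\right)^2 = 0$$
it follows that
$$\hat{\sigma}^m = \frac{E_f(1/s)}{E_f(1/s^2)}.$$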





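Example 5. For the case of normal independently distributed errors it turns out
$$\hat{\beta}_v^m = \hat{\beta}_v^{\square} \quad (v = 1, \dots, r),$$
$$\hat{\sigma}^m = \hat{\sigma}^{\square}\,\sqrt{\frac{n-r}{2}}\;\frac{\Gamma\{\tfrac{1}{2}(n-r+1)\}}{\Gamma\{\tfrac{1}{2}(n-r+2)\}} \to \hat{\sigma}^{\square} \quad\text{as } n \to \infty.$$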
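
The factor multiplying the least-squares estimate in Example 5 can be tabulated directly. The following is a minimal sketch with a hypothetical function name, written in terms of m = n - r and using the logarithm of the gamma function to avoid overflow for large m; it illustrates how quickly the factor approaches unity.

```python
from math import exp, lgamma, sqrt

def lmse_scale_factor(m):
    """Ratio of the least-mean-square-error estimate of sigma to the least-squares one,
    sqrt(m / 2) * Gamma((m + 1) / 2) / Gamma((m + 2) / 2), for m = n - r."""
    return sqrt(m / 2) * exp(lgamma((m + 1) / 2) - lgamma((m + 2) / 2))

for m in (2, 5, 10, 50, 1000):
    print(m, lmse_scale_factor(m))
```

The factor lies below unity for the values shown, so in this case the least-mean-square-error estimate of the scale parameter is somewhat smaller than the least-squares one, whereas the closest estimate of Example 4 is somewhat larger.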


Thanks are due to Dr E. A. Cornish, Dr G. H. Jowett and a referee for helpful comments 
on the manuscript. 





REFERENCES 


Pitman, E. J. G. (1937). The 'closest' estimators of statistical parameters. Proc. Camb. Phil. Soc.
33, 212-22.

Pitman, E. J. G. (1938). The estimation of the location and scale parameters of a continuous popu- 
lation of any given form. Biometrika, 30, 391-421. 

Pitman, E. J. G. (1957). Statistics and science. J. Amer. Stat. Assn. 52, 322-30. 
















Biometrika (1961), 48, 1 and 2, p. 133
Printed in Great Britain 


Latent vectors of random symmetric matrices†


By C. L. MALLOWS 
University College, London‡


SUMMARY 


Matrices of random variables that are symmetric but not necessarily positive definite can 
arise in various ways. Assuming normality and a certain ‘invariant’ covariance structure, or 
alternatively assuming a Wishart structure, we derive exact likelihood ratio tests of certain 
hypotheses on the latent vectors. In the former case, these lead to exact confidence regions 
for the unknown vectors; approximations to these regions are given. The approximate 
validity of the regions in non-normal and non-invariant cases (for example, in the case of 
rounding-off errors) is investigated. An application to the analysis of a quadratic response 
surface is described. 

1. INTRODUCTION 

We consider square symmetric matrices, not necessarily positive definite, whose elements 
are random variables. Such matrices can arise in several ways, for example: 

(i) in numerical work where the elements of a matrix are computed only to a certain 
number of decimals, the rounding-off errors may be regarded as being independent uniform 
random variables; 

(ii) more generally, the cumulative effect of rounding-off errors at various stages of a 
computation may lead to errors in a final matrix that are individually ‘between’ rectangular 
and normal, and which in general are not independent; 

(iii) in the fitting of multifactor quadratic response surfaces one obtains a matrix of 
second-order regression coefficients, the elements of which are linear in the observations; 
these elements may be regarded as being normally distributed, and their covariance structure 
depends on the design of the experiment; 

(iv) many multivariate problems lead to analyses in terms of sample covariance 
matrices, which are necessarily positive definite, and may be assumed to have the Wishart 
distribution. 

In §2 we state various sets of precise assumptions regarding the distribution of the 
elements of the random matrix. Various hypotheses regarding the latent vectors are given 
in §3, and the corresponding test statistics are derived in §4. In § 5, the distributions of the 
normal-theory statistics are investigated in non-normal situations. Confidence regions are 
derived in §6, expressions in statistical differentials are used to derive some refined approxi- 
mations in §7, and § 8 treats a numerical example from two points of view. 


2. ASSUMPTIONS 
We shall assume throughout that the observed $(k \times k)$ matrix $X$ has the structure
    $X = \Xi + Z,$
† This paper is based partly on Technical Report No. 35 of the Statistical Techniques Research
Group of Princeton University, which was written with the support of the Office of Ordnance Research,
U.S. Army while the author was on leave of absence from University College, London.
‡ Now at Columbia University and Bell Telephone Laboratories.










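where $\Xi$ is a fixed symmetric matrix, and $Z$ is a symmetric matrix of random variables with
zero means. Thus $\mathcal{E}(Z) = 0$, $\mathcal{E}(X) = \Xi$.
We shall write
    $\Xi = \Gamma\Lambda\Gamma^T, \quad X = GLG^T, \quad H = G^T\Gamma,$
where $\Gamma$ and $G$ are orthogonal matrices whose columns form normalized latent vectors of
$\Xi$ and $X$, respectively, and where $\Lambda$ and $L$ are diagonal:
    $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_k), \quad L = \mathrm{diag}(l_1, l_2, \ldots, l_k).$
When necessary for definiteness, we shall assume that
    $\Gamma_{1i} > 0, \quad G_{1i} > 0, \quad \lambda_1 > \lambda_2 > \cdots > \lambda_k, \quad l_1 > l_2 > \cdots > l_k.$    (2-1)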


Various assumptions may be appropriate regarding the covariance structure and distribu- 
tional properties of the elements of Z. Here we shall assume that the covariance structure is 
known up to a constant factor, so that if we write
    $\mathrm{cov}(Z_{ij}, Z_{lm}) = V_{ij,lm}\,\sigma^2$
then all the $V$'s are known. If the $Z$'s are rounding-off errors, then $\sigma^2$ will be known also;
otherwise we may require an independent estimate of $\sigma^2$. We shall often assume that the
elements of $Z$ are either (a) multinormal or (b) uniform.

The following sets of more detailed assumptions will lead to especially simple results. 


(i) The invariant covariance structure 
It is shown in Mallows (1959) (referred to hereafter as T.R. 35) that a necessary and
sufficient condition for the covariance structure of a random symmetric matrix to be
invariant under all transformations of the form $Z' = PZP^T$ (with $P$ orthogonal) is
    $\mathrm{cov}(Z_{ij}, Z_{lm}) = \sigma^2\{v(\delta_{il}\delta_{jm} + \delta_{im}\delta_{jl}) + c\,\delta_{ij}\delta_{lm}\}$    (2-2)
for some $\sigma^2$, $v$, $c$. Thus in particular
    $\mathrm{var}(Z_{ij}) = v\sigma^2 \ (i \ne j), \quad \mathrm{cov}(Z_{ii}, Z_{jj}) = c\sigma^2 \ (i \ne j), \quad \mathrm{var}(Z_{ii}) = (2v + c)\sigma^2,$
and all other covariances vanish.
For $c \ge 0$, such a matrix $Z$ can be represented as
    $Z = wI_k + W + W^T,$    (2-3)
where $w$ and all the elements of $W$ are uncorrelated variables, with
    $\mathrm{var}(w) = c\sigma^2, \quad \mathrm{var}(W_{ij}) = \tfrac12 v\sigma^2.$
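The equivalence of the representation (2-3) and the covariance structure (2-2) can be checked mechanically. The following is a minimal sketch only; the function names are hypothetical and not from the paper, and it assumes, as stated above, that $w$ and all $k^2$ elements of $W$ are mutually uncorrelated with the variances just given. Exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0


def cov_invariant(i, j, l, m, v, c, sigma2):
    """cov(Z_ij, Z_lm) as given by the invariant structure (2-2)."""
    return sigma2 * (v * (delta(i, l) * delta(j, m) + delta(i, m) * delta(j, l))
                     + c * delta(i, j) * delta(l, m))


def cov_from_representation(i, j, l, m, v, c, sigma2):
    """cov(Z_ij, Z_lm) implied by Z = w I + W + W^T, i.e. Z_ij = w d_ij + W_ij + W_ji.

    Assumes var(w) = c sigma^2, var(W_ab) = v sigma^2 / 2, everything uncorrelated.
    """
    var_w = c * sigma2
    var_W = Fraction(1, 2) * v * sigma2
    left = [(i, j), (j, i)]    # elements of W entering Z_ij
    right = [(l, m), (m, l)]   # elements of W entering Z_lm
    # Each coincidence of an element of W on both sides contributes var_W.
    cov_W = sum(var_W for p in left for q in right if p == q)
    return var_w * delta(i, j) * delta(l, m) + cov_W


def structures_agree(k, v, c, sigma2):
    """True if (2-3) reproduces (2-2) for every index quadruple of a k x k matrix."""
    return all(
        cov_invariant(i, j, l, m, v, c, sigma2)
        == cov_from_representation(i, j, l, m, v, c, sigma2)
        for i, j, l, m in product(range(k), repeat=4)
    )
```

For example, `structures_agree(3, Fraction(2), Fraction(1, 3), Fraction(5))` compares all 81 index quadruples of a 3 x 3 matrix.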


This structure arises naturally in a certain response-surface problem (see § 8). Here, the
elements of $X$ are linear in the observations, and it is reasonable to assume that they are
normal; neither $\Xi$ nor $X$ is necessarily positive definite. $v$ and $c$ are functions of the
experimental design, and depend on the moments of the design up to the fourth order.

If it is assumed that the observations are not normal, the means and covariances of the
elements of $X$ are unaffected; their fourth moments are affected by the moments of the
design up to the eighth order. We have not attempted to investigate this effect.
















However, there is another reasonably simple non-normal modification which has an 
application to a case involving rounding-off errors. It is obtained by assuming that in the 
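representation (2-3) we have $c = 0$, $v = 1$, and the elements of $W$ are independent with
    $\mathrm{var}(W_{ij}) = \tfrac12\sigma^2, \quad \mu_3(W_{ij}) = 0, \quad \mu_4(W_{ij}) = (\tfrac12\sigma^2)^2(3 + \kappa_4).$    (2-4)
This leads to
    $\mathrm{var}(Z_{ij}) = \sigma^2 \ (i \ne j), \qquad \mu_4(Z_{ij}) = \sigma^4(3 + \tfrac12\kappa_4) \ (i \ne j),$
    $\mathrm{var}(Z_{ii}) = 2\sigma^2, \qquad\qquad\ \mu_4(Z_{ii}) = 4\sigma^4(3 + \kappa_4),$    (2-5)
    $\mu_{22}(Z_{ij}, Z_{lm}) = \mathrm{var}(Z_{ij})\,\mathrm{var}(Z_{lm}) \quad ((i,j) \ne (l,m) \text{ or } (m,l)),$
with all odd moments (up to the fourth order) zero. If the elements of $W$ are independent
rounding-off errors, then $\kappa_4 = -\tfrac65$.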
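The value $\kappa_4 = -\tfrac65$ and the off-diagonal fourth moment in (2-5) can be verified with exact arithmetic. This is an illustrative sketch with hypothetical function names; it assumes a rounding-off error is uniform on a symmetric interval, and uses the standard identity $\mu_4(A+B) = \mu_4(A) + 6\mu_2(A)\mu_2(B) + \mu_4(B)$ for independent zero-mean variables.

```python
from fractions import Fraction


def uniform_even_moment(order, half_width=1):
    """E[U**order] for U uniform on (-a, a) and even order: a**order / (order + 1)."""
    return Fraction(half_width) ** order / (order + 1)


def uniform_excess_kurtosis(half_width=1):
    """kappa_4 = mu_4 / mu_2**2 - 3 for a uniform variable; independent of the width."""
    mu2 = uniform_even_moment(2, half_width)
    mu4 = uniform_even_moment(4, half_width)
    return mu4 / mu2 ** 2 - 3


def mu4_off_diagonal(sigma2, kappa4):
    """Fourth moment of Z_ij = W_ij + W_ji (i != j) under (2-4).

    W_ij and W_ji are independent, each with variance sigma2 / 2 and
    fourth moment (sigma2 / 2)**2 * (3 + kappa4).
    """
    var_w = Fraction(sigma2) / 2
    mu4_w = var_w ** 2 * (3 + kappa4)
    return 2 * mu4_w + 6 * var_w ** 2
```

With `sigma2 = 1` the last function returns `3 + kappa4 / 2`, in agreement with the expression $\sigma^4(3 + \tfrac12\kappa_4)$ in (2-5).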


(ii) The Wishart covariance structure 

Here the covariance structure of the elements of $X$ is the same as that of a matrix of
sample variances and covariances in a sample of $n+1$ from a $k$-variate population with
covariance matrix $\Xi$, and zero third and fourth cumulants. We have
    $\mathcal{E}(X_{ij}) = \Xi_{ij},$
    $\mathrm{cov}(X_{ij}, X_{lm}) = n^{-1}(\Xi_{il}\Xi_{jm} + \Xi_{im}\Xi_{jl}).$
If $C$ is a matrix such that $C\Xi C^T = I_k$, then the matrix
    $Y = CXC^T - I_k$
has zero expectation and has the 'Unit Wishart' covariance structure, namely
    $\mathrm{cov}(Y_{ij}, Y_{lm}) = n^{-1}(\delta_{il}\delta_{jm} + \delta_{im}\delta_{jl}).$

This is a special case of the invariant structure (2-2).

If the underlying $k$-variate population is multinormal, we shall as usual say that $X$ has the
Wishart distribution with $n$ d.f., based on the covariance matrix $\Xi$; in this case $X$ must be
positive definite. If the higher cumulants of the $k$-variate population are not zero, $\mathcal{E}(X_{ij})$ is
unchanged, but now
    $\mathrm{cov}(X_{ij}, X_{lm}) = n^{-1}(\Xi_{il}\Xi_{jm} + \Xi_{im}\Xi_{jl}) + (n+1)^{-1}\kappa_{ijlm}.$    (2-6)


(iii) The rounding-off structure 
Here we assume that the elements of $X$ are rounded off directly, so that the diagonal and
super-diagonal elements of $Z$ are independent uniform variables, with
    $\mathrm{var}(Z_{ij}) = \sigma^2, \quad \mu_3(Z_{ij}) = 0, \quad \mu_4(Z_{ij}) = \sigma^4(3 + \kappa_4),$    (2-7)
where $\kappa_4 = -\tfrac65$.
3. HYPOTHESES CONSIDERED 
Consider the following hypothesis regarding latent vectors (l.v.'s) of $\Xi$:

$H_p(\Gamma_p)$: A specified set of $p\,(\le k)$ orthogonal unit vectors $\gamma_1, \gamma_2, \ldots, \gamma_p$ forming the columns
of a $(k \times p)$ matrix $\Gamma_p$ are l.v.'s of $\Xi$, i.e.
    $\Xi\gamma_i = \gamma_i\lambda_i$ for some $\lambda_i$ $(i = 1, \ldots, p),$
or equivalently
    $(I_k - \Gamma_p\Gamma_p^T)\,\Xi\,\Gamma_p = 0.$    (3-1)





This hypothesis imposes
    $d_p = p(k-1) - \tfrac12 p(p-1)$
independent linear constraints on the elements of $\Xi$; it is therefore a linear hypothesis of this
order. $H_p(\Gamma_p)$ is the intersection of the hypotheses $H_1(\gamma_1), \ldots, H_1(\gamma_p)$; $H_k(\Gamma_k)$ asserts that
$\Gamma_k^T\Xi\,\Gamma_k$ is diagonal.

Notice that the order of the columns of $\Gamma_p$ is not important; we are not here imposing the
restriction that the corresponding latent roots (l.r.'s) are ordered.

Although $H_p(\Gamma_p)$ is a linear hypothesis, the elements of $\Gamma_p$ are not linear parameters. The
matrix $\Xi$ can be specified by any set of $d_0 = \tfrac12 k(k+1)$ functionally independent elements, for
example, by the elements in and above the diagonal. Thus $\Xi$ can be represented by a point $\xi$
in a $d_0$-dimensional space $\Omega$. $H_p(\Gamma_p)$ asserts that $\xi$ lies in a linear subspace of $d_0 - d_p$
dimensions. However (if $k \ge 3$), for any one pair $(i, j)$, if $\Gamma_{ij}$ is held fixed, the locus of points
$\xi$ corresponding to matrices $\Xi$ satisfying (3-1) for some values of the remaining elements of
$\Gamma_p$ is not a linear subspace. This implies that the least-squares estimates of the elements of $\Gamma_p$
are not linear in the elements of $X$; and there do not exist $d_p$ linear functions of the elements
of $X$ which are sufficient to determine these estimates. This does not affect the possibility of
constructing exact confidence regions for the elements of $\Gamma_p$ (compare Beale (1960), especially
pp. 44-5).

We can consider also the linear hypothesis $H_p(\Gamma_p; \Lambda_p)$ which specifies $p$ l.v.'s of $\Xi$ and also
the associated l.r.'s; and the linear conditional hypothesis $H_p(\Lambda_p | \Gamma_p)$ which specifies the
l.r.'s conditionally on the l.v.'s. These hypotheses impose respectively $p + d_p$ and $p$
constraints on the elements of $\Xi$. $H_k(\Gamma_k; \Lambda_k)$ specifies $\Xi$ completely.

There are many possibilities in the construction of systems of nested linear hypotheses; to 
preserve the linearity of the set-up, the hypotheses introduced at each stage must never 
specify a l.r. before the corresponding l.v. 


4. DERIVATION OF TEST CRITERIA 
4.1. Normal invariant case. We derive the likelihood ratio tests of the hypotheses
described in the previous section, relative to the alternative hypothesis that $\Xi$ is completely
arbitrary. When the elements of $Z$ are multinormal with zero means and with the invariant
covariance structure (2-2), the log likelihood is
    $L(\Xi) = \text{const.} - \tfrac14 k(k+1)\ln\sigma^2 - \tfrac12 S(\Xi),$    (4-1)
where
    $S(\Xi) = \dfrac{1}{2v\sigma^2}\left\{\mathrm{tr}\,(X - \Xi)^2 - \dfrac{c}{2v + kc}\,[\mathrm{tr}\,(X - \Xi)]^2\right\}.$
Treating $\sigma^2$ as known for the present, the unrestricted maximum is obtained by setting
$\hat\Xi = X$, and we have $\min S(\Xi) = S(X) = 0$.


In maximizing $L(\Xi)$ subject to the $p + d_p$ constraints on $\Xi$ imposed by $H_p(\Gamma_p; \Lambda_p)$, it would
seem that we should include terms containing just this number of Lagrange multipliers.
However, we shall proceed by including a full $(k \times p)$ matrix $\Theta$ of multipliers, one for each of
the (non-independent) constraints in
    $\Xi\,\Gamma_p = \Gamma_p\Lambda_p.$
The resulting equations will not be sufficient to determine all the Lagrange multipliers
uniquely, but will be sufficient to determine the maximum of $L(\Xi)$. Thus we minimize
    $S(\Xi) + \mathrm{tr}\,(\Theta^T(\Xi\,\Gamma_p - \Gamma_p\Lambda_p)),$
















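obtaining after some algebra the test statistic
    $\psi_p(\Gamma_p; \Lambda_p) = \min S(\Xi)$
    $\quad = \dfrac{1}{2v\sigma^2}\left\{2\,\mathrm{tr}\,(\Gamma_p^T X (I_k - \Gamma_p\Gamma_p^T) X \Gamma_p) + \mathrm{tr}\,(\Gamma_p^T X \Gamma_p - \hat\Lambda_p)^2\right\}$
    $\qquad + \dfrac{1}{2v\sigma^2}\left\{\mathrm{tr}\,(\hat\Lambda_p - \Lambda_p)^2 - \dfrac{c}{2v + pc}\,[\mathrm{tr}\,(\hat\Lambda_p - \Lambda_p)]^2\right\}$    (4-2)
    $\quad = \psi_p(\Gamma_p) + \psi_p(\Lambda_p | \Gamma_p),$
where $\hat\Lambda_p$ $(p \times p)$ contains the diagonal elements of $\Gamma_p^T X \Gamma_p$, with zeros elsewhere. (Van der
Vaart (1958) remarks that $\hat\Lambda_p$ is an unbiased estimator of $\Lambda_p$ while the elements of $L$ do not
give unbiased estimates.) $\psi_p(\Gamma_p; \Lambda_p)$ is distributed as $\chi^2$ with $p + d_p$ d.f., and with
non-centrality obtained from (4-2) by replacing $X$ by $\Xi$ throughout. The non-centrality is zero if
$H_p(\Gamma_p; \Lambda_p)$ is true.
For $H_p(\Gamma_p)$, we now minimize (4-2) with respect to $\Lambda_p$. Hence the statistic $\psi_p(\Gamma_p)$ for
testing $H_p(\Gamma_p)$ is given by the first two terms of (4-2), and $\psi_p(\Lambda_p | \Gamma_p)$ by the last two; the
degrees of freedom are $d_p$ and $p$, respectively.
Putting $p = 1$, and writing $\gamma$, $\lambda$ for $\Gamma_1$, $\Lambda_1$, we find
    $\psi_1(\gamma; \lambda) = \dfrac{1}{v\sigma^2}\left\{\gamma^T X^2 \gamma - (\gamma^T X \gamma)^2\right\} + \dfrac{1}{(2v + c)\sigma^2}\,(\gamma^T X \gamma - \lambda)^2$    (4-3)
    $\quad = \psi_1(\gamma) + \psi_1(\lambda | \gamma).$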
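The two components of (4-3) are simple quadratic forms in $X$. The sketch below, with hypothetical function names and matrices held as plain lists of rows, evaluates them directly; it assumes $X$ is symmetric, so that $\gamma^T X^2 \gamma$ equals the squared length of $X\gamma$.

```python
def mat_vec(X, g):
    """Product X g for a matrix X (list of rows) and a vector g."""
    return [sum(x * gj for x, gj in zip(row, g)) for row in X]


def dot(a, b):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def psi1_vector(X, gamma, v, sigma2):
    """psi_1(gamma) of (4-3): {g' X^2 g - (g' X g)^2} / (v sigma^2), X symmetric."""
    Xg = mat_vec(X, gamma)
    return (dot(Xg, Xg) - dot(gamma, Xg) ** 2) / (v * sigma2)


def psi1_root_given_vector(X, gamma, lam, v, c, sigma2):
    """psi_1(lambda | gamma) of (4-3): (g' X g - lambda)^2 / ((2v + c) sigma^2)."""
    return (dot(gamma, mat_vec(X, gamma)) - lam) ** 2 / ((2 * v + c) * sigma2)
```

For the diagonal matrix `X = [[3, 0], [0, 1]]` with `v = sigma2 = 1`, the first statistic vanishes at the latent vector `[1, 0]`, and at `gamma = [0.6, 0.8]` it equals $h_1^2 h_2^2 (l_1 - l_2)^2 = 0.9216$.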


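Putting $p = k$, the first term in (4-2) vanishes, and $\psi_k(\Gamma_k)$ can be written in the forms
    $\psi_k(\Gamma_k) = \dfrac{1}{v\sigma^2}\sum_{i<j}(\Gamma_k^T X \Gamma_k)_{ij}^2$
    $\quad = \dfrac{1}{2v\sigma^2}\left\{\sum_{i,j} X_{ij}^2 - \sum_i (\Gamma_k^T X \Gamma_k)_{ii}^2\right\}$    (4-4)
    $\quad = \tfrac12\sum_i \psi_1(\gamma_i),$
where $\psi_1(\gamma_i)$ is $\psi_1(\gamma)$ computed for the $i$th column of $\Gamma_k$. Also
    $\psi_k(\Lambda_k | \Gamma_k) = 1_k^T(\hat\Lambda_k - \Lambda_k)\,V^{-1}(\hat\Lambda_k - \Lambda_k)\,1_k,$    (4-5)
where $1_k$ is a $(k \times 1)$ column vector of ones, so that $\Lambda_k 1_k$ is a column vector formed from the
diagonal elements of $\Lambda_k$; and $V$ is the covariance matrix of the diagonal elements of $X$, i.e.
    $V = 2v\sigma^2 I_k + c\sigma^2\,1_k 1_k^T,$
    $V^{-1} = (2v\sigma^2)^{-1}\left(I_k - c(2v + kc)^{-1}\,1_k 1_k^T\right).$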
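The agreement of the three forms in (4-4) follows from the orthogonality of $\Gamma_k$, and can be confirmed numerically. The following is a sketch under that assumption only; the helper names are hypothetical and matrices are plain lists of rows.

```python
def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Matrix product A B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def psi_k_forms(X, Gamma, v, sigma2):
    """Return the three expressions of (4-4) for psi_k(Gamma_k), Gamma orthogonal."""
    k = len(X)
    M = matmul(matmul(transpose(Gamma), X), Gamma)   # Gamma' X Gamma
    # Form 1: sum of squared super-diagonal elements of Gamma' X Gamma.
    form_a = sum(M[i][j] ** 2 for i in range(k) for j in range(i + 1, k)) / (v * sigma2)
    # Form 2: total sum of squares of X minus squared diagonal of Gamma' X Gamma.
    total_sq = sum(x * x for row in X for x in row)
    form_b = (total_sq - sum(M[i][i] ** 2 for i in range(k))) / (2 * v * sigma2)
    # Form 3: half the sum of the single-vector statistics psi_1(gamma_i).
    X2 = matmul(X, X)

    def psi1(g):
        quad2 = sum(g[i] * X2[i][j] * g[j] for i in range(k) for j in range(k))
        quad1 = sum(g[i] * X[i][j] * g[j] for i in range(k) for j in range(k))
        return (quad2 - quad1 ** 2) / (v * sigma2)

    form_c = 0.5 * sum(psi1(g) for g in transpose(Gamma))
    return form_a, form_b, form_c
```

The three returned values should coincide up to floating-point error for any symmetric `X` and orthogonal `Gamma`.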


If $\sigma^2$ is unknown but an independent estimate $s^2$ is available which is distributed as
$\sigma^2\chi_f^2/f$, the obvious modifications have to be made to the above derivations. The likelihood
ratio criteria are now monotone functions of the statistics obtained from the $\psi$'s given above
by replacing $\sigma^2$ by $s^2$ throughout; the distribution of the modified $\psi_p(\Gamma_p)$, for example,
becomes that of $d_p F$, where $F$ has $d_p$ and $f$ degrees of freedom. When dealing with a conditional
hypothesis, an estimate of $\sigma^2$ becomes available from $X$ itself. Thus to test the hypothesis
$H_1(\lambda | \gamma)$ we may compare
    $\dfrac{(k-1)\,\psi_1(\lambda | \gamma)}{\psi_1(\gamma)} = \dfrac{(k-1)v}{2v + c}\cdot\dfrac{(\gamma^T X \gamma - \lambda)^2}{\gamma^T X^2 \gamma - (\gamma^T X \gamma)^2}$
with the $F$ distribution on 1, $k-1$ d.f. The procedure can be presented as an analysis of
variance.
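4.2. General nested hypotheses. Suppose the hypothesis $H$ specifies $p = p_1 + p_2$ l.v.'s and
$p_1$ of the corresponding l.r.'s. We now need to minimize $S(\Xi)$ subject to the constraints
    $\Xi\,\Gamma_1 = \Gamma_1\Lambda_1, \quad \Xi\,\Gamma_2 = \Gamma_2\Lambda_2,$    (4-6)
where $\Gamma_1\,(k \times p_1)$, $\Gamma_2\,(k \times p_2)$ and $\Lambda_1$ ($(p_1 \times p_1)$ diagonal) are specified by the hypothesis, but
$\Lambda_2$ ($(p_2 \times p_2)$ diagonal) is arbitrary. Minimizing first with $\Lambda_2$ held fixed, we reach the
right-hand side of (4-2) with
    $\Gamma_p = (\Gamma_1, \Gamma_2), \quad \Lambda_p = \begin{pmatrix}\Lambda_1 & 0\\ 0 & \Lambda_2\end{pmatrix}, \quad \hat\Lambda_p = \begin{pmatrix}\hat\Lambda_1 & 0\\ 0 & \hat\Lambda_2\end{pmatrix}.$
As before, $\hat\Lambda_p$ consists of the diagonal elements of $\Gamma_p^T X \Gamma_p$. Hence the minimum when $\Lambda_2$ is
unspecified is obtained by setting $\Lambda_2 = \hat\Lambda_2$, and
    $\min_H S(\Xi) = \psi_p((\Gamma_1, \Gamma_2)) + \psi_{p_1}(\Lambda_1 | \Gamma_1).$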


If $H_0, H_1, \ldots, H_q$ is a nested sequence of hypotheses of the above type, where $H_0$ specifies $\Xi$
completely and $H_q$ leaves $\Xi$ completely arbitrary, then we have an analysis of variance of the
form

    Source                        S.S.                                  d.f.
    About $H_0$ within $H_1$      $S(\hat\Xi_0) - S(\hat\Xi_1)$         $d_0 - d_1$
    About $H_1$ within $H_2$      $S(\hat\Xi_1) - S(\hat\Xi_2)$         $d_1 - d_2$
    ...                           ...                                   ...
    About $H_{q-1}$               $S(\hat\Xi_{q-1})$                    $d_{q-1}$

    Total                         $S(\hat\Xi_0)$                        $d_0$

where $S(\hat\Xi_i) = \min_{H_i} S(\Xi)$ so that $S(\hat\Xi_q) = 0$, and if $H_i$ specifies $a_i$ l.v.'s and $b_i$ of the
corresponding l.r.'s, then $d_i = (k-1)a_i - \tfrac12 a_i(a_i - 1) + b_i$.
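The degrees of freedom in the analysis of variance follow directly from the counting formula for $d_i$. A minimal sketch (the function name is hypothetical) that evaluates it and reproduces the special cases quoted in § 3:

```python
def constraint_count(k, a, b):
    """d = (k - 1) a - a (a - 1) / 2 + b: number of linear constraints imposed on a
    k x k symmetric mean matrix by specifying a latent vectors and b of the
    corresponding latent roots (b <= a <= k)."""
    if not 0 <= b <= a <= k:
        raise ValueError("need 0 <= b <= a <= k")
    # a (a - 1) is always even, so integer division is exact.
    return (k - 1) * a - a * (a - 1) // 2 + b
```

Specifying all $k$ vectors and all $k$ roots gives $\tfrac12 k(k+1)$, the number of functionally independent elements of the matrix, while specifying nothing gives zero.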


4.3. Wishart case. Suppose $\mathbf{x}^T = (x_1, \ldots, x_k)$ is a vector variate having covariance
matrix $\Xi$. If $\mathbf{x}_1, \ldots, \mathbf{x}_{n+1}$ are $n+1$ independent observations, then
    $X = \dfrac{1}{n}\sum_{i=1}^{n+1}(\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T \qquad \left(\bar{\mathbf{x}} = \dfrac{1}{n+1}\sum_{i=1}^{n+1}\mathbf{x}_i\right)$
is the sample covariance matrix, and $\mathcal{E}(X) = \Xi$.
The hypotheses described in § 3 have interpretations in terms of correlations between the
basic variates. If these variates are assumed to be multinormal, the hypotheses refer to
independence properties.

When $\Gamma_p$ is given, let $\Gamma_q$ be any $(k \times (k-p))$ matrix such that $(\Gamma_p, \Gamma_q)$ is orthogonal. Then
the hypothesis $H_p(\Gamma_p)$ specifies that the variates forming the elements of $\Gamma_p^T\mathbf{x}$ are
uncorrelated with each other and with the elements of $\Gamma_q^T\mathbf{x}$; i.e. that they are principal components
of the basic distribution. $H_p(\Gamma_p; \Lambda_p)$ specifies the variances of the elements of $\Gamma_p^T\mathbf{x}$ also. In
the normal case, the theory of testing these hypotheses is almost standard (see e.g.
Anderson (1958) chapters 9 and 10); $nX$ is now distributed in the Wishart form with $n$ d.f.,
based on the covariance matrix $\Xi$. Hence the likelihood ratio criterion for testing $H_p(\Gamma_p)$ is
given by
    $-2\ln\Lambda = n\ln\left\{\prod_i (\Gamma_p^T X \Gamma_p)_{ii}\,|\Gamma_q^T X \Gamma_q|\,|X|^{-1}\right\}$    (4-7)











and that for testing $H_p(\Gamma_p; \Lambda_p)$ by
    $-2\ln\Lambda = n\ln\left\{|\Lambda_p|\,|\Gamma_q^T X \Gamma_q|\,|X|^{-1}\right\} + n\,\mathrm{tr}\,(\Gamma_p^T X \Gamma_p \Lambda_p^{-1}) - np.$    (4-8)
The distributions of these criteria do not depend on any nuisance parameters; under the
null hypothesis, they are asymptotically distributed as $\chi^2$ with respectively $d_p$ and $p + d_p$ d.f.
Box's (1949) results show that a closer approximation for the first of these criteria is
obtained by taking $\delta^{-1}(-2\ln\Lambda)$ to be distributed as $\chi^2$ with $d_p$ d.f., where

1 


_ cman as = fo “ane as 
o=1+5,(1+%), a, = k*—(k—p)—p. 


However, his method does not apply to the second criterion, which has the same distribution 
as has 


mp In (ne-!) — n In |A,| + tr (A,) + tr (A,), 
where A,(pxp) and A, (px) are independently distributed in the Wishart form on 
respectively n—k+p and k—pd.f., based on the covariance matrix 


z, = A; TEP, A>! 


which reduces to I, when the null hypothesis is true. The characteristic function of this 
criterion is easily found to be 


i [3 /- » I[k(n—k+p+1—j)—nit] 
n 


[I — 2H, Pent) (nk +p + 1-7) 
so that the mean is p+d,+n(tr=, —p—In|Z,|)+O(n-). 


4.4. Rounding-off case. If we follow through the likelihood ratio procedures assuming
that the elements of $Z$ are distributed uniformly on $(-\epsilon, +\epsilon)$, as is appropriate when they
are rounding-off errors, we arrive in each case at test statistics which take only two values,
one of which is zero; the non-zero value being attained if and only if there exists a matrix $\Xi$
satisfying the constraints imposed by the hypothesis tested and such that each of the
elements of $X - \Xi$ lies in $(-\epsilon, +\epsilon)$. The problem of determining whether such a matrix
exists can be formulated as one of linear programming.

An alternative approach is to derive test statistics assuming that the elements of $Z$ have
the correct variances (as in (2-7)) but are normal instead of uniform, and then to find
approximations to the null-hypothesis distributions of these statistics under the uniformity
assumption. This leads to very complicated results except when $p = 1$, in which case we
obtain the statistics (for details of this derivation see T.R. 35)
    $\psi_1^*(\gamma; \lambda) = \psi_1^*(\gamma) + \psi_1^*(\lambda | \gamma),$
    $\psi_1^*(\gamma) = \sigma^{-2}(\gamma^T X D X \gamma - \delta\lambda^{*2}),$    (4-9)
    $\psi_1^*(\lambda | \gamma) = \sigma^{-2}\,\delta(1 + \delta)^{-1}(\lambda - \lambda^*)^2,$
where $D\,(k \times k)$ is diagonal with $D_{ii} = (1 - \gamma_i^2)^{-1}$, and
    $\delta = \gamma^T D \gamma, \quad \lambda^* = \delta^{-1}\gamma^T D X \gamma.$


Under the normality assumption, these statistics have $\chi^2$ distributions with respectively
$k$, $k-1$, 1 d.f. Under the uniformity assumption, their expectations are unchanged, but
their variances become complicated functions of the elements of $\gamma$ (given explicitly in
T.R. 35, table 15). We give in Table 1 the values of these variances for the two types of
vectors
    (i) $\gamma(\mathrm{i}) = (0, \ldots, 0, 1, 0, \ldots, 0)^T$ (unit in any position),
    (ii) $\gamma(\mathrm{ii}) = k^{-1/2}(1, \pm 1, \ldots, \pm 1)^T$ (any allocation of signs).    (4-10)


While this has not been proved rigorously, it seems likely that these vectors correspond to
extreme values of the variances. It will be seen that for small $k$, with $\kappa_4 = -\tfrac65$, these
variances depend fairly strongly on the vector $\gamma$. The simple assumption that the
distributions are adequately represented by the normal theory $\chi^2$ distributions is inadequate;
a closer approximation is obtained by introducing a scale factor (depending on $\gamma$) and
modifying the d.f. However, the situation is still unsuited to the construction of confidence
regions for $\gamma$ or $(\gamma; \lambda)$, since the statistics in (4-9) are complicated functions of $\gamma$.


Table 1. Means (M) and variances (V) of the statistics in (4-9) under the null hypothesis, 
assuming the elements of Z to have the rounding-off structure (2-7), for the two vectors in (4-10) 


Statistic M Viy=y(i)) Viy¥=y(ii)) 

es he aiear | 
Wily; A) k (2+K,)k abn 2-5 +a —peEip 

1% age 2 

yi(y) k-1 (2+,)(k—1) 2(k—1) +k, *~5* aa th 

* 8k-—7 ) 
WT(Aly) 1 2+Ky 2+ ropa ae 


We therefore reject this approach, and suggest that the statistics appropriate to the
invariant structure (given in (4-2), (4-3)) be used for the present case also. Assuming
uniformity, the null-hypothesis moments of these statistics are easily found (see the next
section); and the (approximate) confidence regions obtained using these statistics are much
simpler to work with than are those obtained using (4-9).


5. NON-NORMAL MOMENTS OF THE NORMAL INVARIANT STATISTICS 


We give some lower moments of the statistics given in (4-2), (4-3), (4-4), (4-5); first
(case I), for the non-normal version of the invariant structure (2-5), and secondly (case R),
for the rounding-off structure (2-7). The statistics for $1 < p < k$ are complex, and we have
not calculated the second moments for these cases.

The moments of these statistics when $X$ has a general Wishart distribution depend on the
matrix $\Xi$; when $X$ has a unit Wishart distribution, we find


ElYp(Ey)} = dy, varyg(E,) = 2d, +~ (k2— (p+ 1) (2p +1). 


For the rounding-off case with $\kappa_4 = -\tfrac65$, and for each of the two 'extreme' vectors in
(4-10), the means and variances given in Table 3 were evaluated numerically; equating
these moments to those of $\theta\chi_f^2$ we arrive at the (approximate) 5 and 1% points given in
Table 5. It will be seen that the effect of $\gamma$ on the distribution of $\psi_1(\gamma)$ and $\psi_1(\gamma; \lambda)$ is
considerable.

In Table 6 we give approximate 5 and 1% points of distributions fitted to the moments of
the statistics $\psi_k(\Gamma_k)$ and $\psi_k(\Gamma_k; \Lambda_k)$, for the rounding-off case. The distribution of













$\psi_k(\Gamma_k; \Lambda_k)$ does not depend on $\Gamma_k$; for $\psi_k(\Gamma_k)$ we have taken, first, $\Gamma_k = I_k$, the unit $(k \times k)$
matrix, and secondly, we have averaged the moments in Table 4 by assuming all orientations
of $\Gamma_k$ to be equally likely (for details see T.R. 35). The effect of $\Gamma_k$ is seen to be considerable.


Table 2. Means of the statistics in (4-2) for the invariant (I) and rounding-off (R) 
structures ((2-5) and (2-7), respectively); p general 


Statistic M(I) M(R) 
y,(T,) d, d,+43(UI y+ & Ii,T%.— 2p) 
ab abe 
yy (A,|T,) P p-32XT% 
ab 
¥AT,; A,) pt+d, p+d,t+ % DevTae— 2p) 
a 


Table 3. Means (M) and variances (V) of the statistics in (4*3) for the invariant (I) and 
rounding-off (R) structures 


Statistic M(I) V(1) M(R) V(R) 
Wily) k—-1 %(k-1) k-24+2, 2k—64+65,—45,+2D3 
+ 4x,(1+(k—2) 5, — 82, + 823) +k,(1+(k—9) 2, +62, — 72, +823) 
Waly) 1 24K, 24 1-32, &(1—42,)?+«,(224—-F3,) 
WilysA) k 2k+ k-14+42, 2k-4445,-—25,4+423 
4x,(1+(k+ 2) D,—42, + 223) +k,(1+(k—5) 2, +32, —Z2, + 253) 


(N.B. >, = Dy%)- 
a 


Table 4. Means (M) and variances (V) of the statistics in (4-4) and (4-5) for the 
invariant (I) and rounding-off (R) structures; p = k 


Statistic M(1) V1) M(R) V(R) 
yw (T;) d, 2d,(1+ 44) $k(k-—2)+4A k(k—8)+A+4C 
+K,4(C —A) +x,($k? -$k+3A + 2C—4B) 
WAIT) k 2k+K,C k-4A 2k—2A +40 +K,(20 —3B) 
k+3 
wT ss Ay) k+d, 2k +ay)(1 +k 4) yk? k(k— 4) (1+ 344) 


(N.B. A==r,, B=2r, C==r,, rao= D7 Fy A 
a a ab € 


Table 5. Approximate 5 and 1% points of the statistics $\psi_1(\gamma)$ and $\psi_1(\gamma; \lambda)$ in (4-3) in
the rounding-off case, for the two vectors in (4-10)

         $\psi_1(\gamma(\mathrm{i}))$    $\psi_1(\gamma(\mathrm{ii}))$    $\psi_1(\gamma(\mathrm{i}); \lambda)$    $\psi_1(\gamma(\mathrm{ii}); \lambda)$
    k      5%      1%        5%      1%        5%      1%        5%      1%
    2     2.77    4.13      1.71    2.81      3.43    4.73      3.20    4.61
    3     4.43    6.03      3.70    5.50      5.04    6.62      5.02    6.96
    4     5.92    7.72      5.48    7.76      6.50    8.29      6.79    9.18
    5     7.32    9.28      7.18    9.83      7.90    9.85      8.48   11.25
    6     8.68   10.78      8.80   11.78      9.25   11.35     10.09   13.20
    7    10.00   12.23     10.37   13.63     10.56   12.80     11.66   15.04
    8    11.29   13.64     11.89   15.41     11.85   14.21     13.17   16.81







Table 6. Approximate 5 and 1% points of the statistics $\psi_k(\Gamma_k; \Lambda_k)$ and $\psi_k(\Gamma_k)$ in the
rounding-off case

         $\psi_k(\Gamma_k; \Lambda_k)$      $\psi_k(I_k)$        $\psi_k(\Gamma_k)$, average
    k      5%      1%        5%      1%        5%      1%
    2     4.82    6.78      2.77    4.13      2.43    3.88
    3     8.81   11.45      5.92    7.72      5.34    7.29
    4    13.75   17.01     10.00   12.23      9.13   11.56
    5    19.66   23.53     15.06   17.73     13.85   16.74
    6    26.57   31.02     21.11   24.22     19.52   22.86
    7    34.46   39.51     28.16   31.70     26.16   29.94
    8    43.35   48.98     36.21   40.17     33.78   38.01


6. CONFIDENCE REGIONS 


6.1. Normal case. Having derived test criteria for the hypotheses of § 3, we can construct
confidence regions for the $p$ latent vectors comprising $\Gamma_p$, or for the latent vectors and roots
jointly. Thus, for example, assuming normality and the invariant covariance structure we
have
    $P\{\psi_p(\Gamma_p) \le \chi^2_{d_p}(\alpha)\} = \alpha,$
where $\chi_f^2(\alpha)$ denotes the $100\alpha$% point of the $\chi^2$ distribution on $f$ d.f. Hence the region in
parameter space
    $R_p = \{\Gamma_p : \psi_p(\Gamma_p) \le K_1\}$    (6-1)
is a confidence region with confidence coefficient $\alpha$ if $K_1 = \chi^2_{d_p}(\alpha)$. If $\sigma^2$ is unknown, we
replace it by its estimate $s^2$ and take $K_1 = d_p F_{d_p, f}(\alpha)$, where $f$ is the d.f. of $s^2$.


6.2. The case p = 1. When $p = 1$, the (unit) vector $\gamma$ can be represented as a point on the
surface of the unit hypersphere (in $k$ dimensions). If we impose the restriction (2-1), $\gamma$ can
vary only over half the surface of this hypersphere; to avoid non-essential complexities we
shall relax this part of (2-1), and shall identify $\gamma$ with a pair of points on the hypersphere, at
opposite ends of a diameter. A confidence region for $\gamma$ will then consist of a set of diametral
point-pairs.

For almost all observed matrices $X$, all the latent roots $l_1, \ldots, l_k$ of $X$ are distinct, and the
corresponding latent vectors $g_1, \ldots, g_k$ (columns of $G$) are uniquely defined (either as points,
subject to (2-1), or, as we shall assume, as diametral pairs of points on the unit hypersphere).

The $100\alpha$% confidence region for $\gamma$ derived from $\psi_1(\gamma)$ is
    $R_1 = \{\gamma : \gamma^T X^2 \gamma - (\gamma^T X \gamma)^2 \le K_2\},$    (6-2)
where $K_2$ is $v\sigma^2\chi^2_{k-1}(\alpha)$ if $\sigma^2$ is known, or $(k-1)vs^2F_{k-1, f}(\alpha)$ if an estimate $s^2$ is used. For $\alpha$ not
too near 1, $R_1$ consists of $k$ cap-pairs $R_{1j}$ $(j = 1, \ldots, k)$ on the unit hypersphere, where $R_{1j}$
contains $g_j$. As $\alpha$ (or $v\sigma^2$) increases, the caps get larger, until they begin to coalesce; for
$\alpha$ sufficiently large, $R_1$ contains the whole surface of the hypersphere.

This last phenomenon is unattractive; in §6.5 we discuss the possibility of constructing 
a system of confidence regions which do not suffer from this defect. 

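To find the shape of the cap-pair $R_{1j}$, write
    $h = G^T\gamma, \quad \gamma = Gh, \quad XG = GL, \quad L = \mathrm{diag}(l_1, l_2, \ldots, l_k).$
Then the region $R_1$ becomes
    $R_1 = \left\{\gamma : \sum_{i(\ne j)} h_i^2(l_i - l_j)^2 - \left[\sum_{i(\ne j)} h_i^2(l_i - l_j)\right]^2 \le K_2\right\}$  ($j$ = 1 or 2 or ... or $k$).    (6-3)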










If $l_j$ is different from all the other l.r.'s (as it will be with probability 1), then for small values
of $K_2$ the region $R_{1j}$ is approximately
    $R_{1j} \approx R^*_{1j} = \left\{\gamma : \sum_{i(\ne j)} h_i^2(l_i - l_j)^2 \le K_2\right\}$
    $\qquad\quad = \{\gamma : \gamma^T(X - l_j I_k)^2\gamma \le K_2\},$    (6-4)
so that the boundary of $R_{1j}$ is given approximately by the intersection of the unit
hypersphere with a certain hyperelliptic cylinder, having semi-axes
    $s_{ij} = K_2^{1/2}\,|l_i - l_j|^{-1} \quad (i \ne j).$
The approximation involved in obtaining (6-4) will be adequate provided all these semi-axes
are small compared with unity (the neglected terms in (6-4) are of order $(s_{ij}^2)$ relative to those
retained). A second approximation to the confidence coefficient associated with the region
$R^*_{1j}$ in (6-4) is derived in § 7.
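The size of the term neglected in passing from (6-3) to (6-4) is easily examined numerically. The sketch below works in the latent-vector basis of $X$, so that $\gamma$ is represented by its unit coordinate vector $h = G^T\gamma$; the function names are hypothetical and the code is illustrative only.

```python
def exact_statistic(h, roots):
    """Left-hand side of (6-2) in the basis of latent vectors of X:
    sum h_i^2 l_i^2 - (sum h_i^2 l_i)^2, for a unit vector h."""
    s2 = sum(hi * hi * li * li for hi, li in zip(h, roots))
    s1 = sum(hi * hi * li for hi, li in zip(h, roots))
    return s2 - s1 ** 2


def approx_statistic(h, roots, j):
    """Left-hand side of (6-4): sum over i != j of h_i^2 (l_i - l_j)^2."""
    lj = roots[j]
    return sum(hi * hi * (li - lj) ** 2
               for i, (hi, li) in enumerate(zip(h, roots)) if i != j)


def neglected_term(h, roots, j):
    """The squared bracket of (6-3): [sum over i != j of h_i^2 (l_i - l_j)]^2."""
    lj = roots[j]
    return sum(hi * hi * (li - lj)
               for i, (hi, li) in enumerate(zip(h, roots)) if i != j) ** 2
```

With latent roots (5, 2, 1) and a vector close to the first latent vector, having $h_2 = 0.1$ and $h_3 = 0.05$, the approximate form gives 0.13 against the exact 0.1284, the difference being the neglected squared term 0.0016. Since that term is never negative, the approximate region (6-4) lies inside the exact cap.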

6.3. Discussion. The shape of the confidence region $R_1$ is a novel feature of this problem,
and merits some discussion. The statistic $\psi_1(\gamma)$ is a function of the matrix $X$; let us write it
temporarily as $\psi(\gamma, X)$. We have shown that if $\gamma$ is a l.v. of $\Xi$, say $\gamma = \gamma_j$, then $\psi(\gamma_j, X)$ is
distributed as $\chi^2_{k-1}$. This will be so for each of the l.v.'s; so for each $j$ separately, the
probability that the region
    $R_1 = \{\gamma : \psi(\gamma, X) \le K_2\}$
contains $\gamma_j$ is the announced confidence coefficient $\alpha$ (if $K_2 = \chi^2_{k-1}(\alpha)$). Here, we can label the
l.v.'s of $\Xi$ in any way we choose, independent of $X$; thus for example each of the following
statements will be true with probability exactly $\alpha$:

(i) $R_1$ contains the l.v. of $\Xi$ corresponding to the largest l.r.;

(ii) $R_1$ contains the l.v. making the smallest angle with the vector $(1, 0, \ldots, 0)^T$.

The following questions now arise. Assuming that $v\sigma^2K_2$ is sufficiently small, so that $R_1$
consists of $k$ cap-pairs, can we assert (with exactly specified confidence) that

(iii) of these $k$ cap-pairs, a particular one (say that one containing the l.v. of $X$
corresponding to the largest l.r. of $X$) contains a l.v. of $\Xi$; in particular, that

(iv) this cap-pair contains the l.v. of $\Xi$ corresponding to the largest l.r. of $\Xi$; or that

(v) each of the $k$ cap-pairs contains a l.v. of $\Xi$.

We can do none of these exactly, since it is never with probability 1 that $R_1$ splits up into
$k$ cap-pairs; but, to a good approximation, if the latent roots of $\Xi$ are well separated, (iii) and
(iv) are approximate confidence statements with coefficient $\alpha$ (see § 7). With exact confidence
we can assert a statement closely similar to (v), as follows:

(vi) $R_1$ contains all the l.v.'s of $\Xi$. The probability of this event is
    $P\{\psi(\gamma_1, X) \le K_2, \ldots, \psi(\gamma_k, X) \le K_2\} = P(k, K_2)$, say,
which is not $\alpha^k$, since the different $\psi$'s are not independent; in fact they have the joint
structure
    $\psi(\gamma_j, X) = \sum_{i(\ne j)} W_{ij}^2 \quad (j = 1, 2, \ldots, k),$
where $W = \Gamma^T Z \Gamma$ is (after division by $(v\sigma^2)^{1/2}$) a symmetric matrix whose superdiagonal
elements are independent unit normal variables. The explicit determination of the function $P(k, K_2)$ is difficult;
however, heuristic reasoning suggests that the approximations
    $P(k, K_2) \approx \exp\{-k(1 - \alpha)\} \approx 1 - k(1 - \alpha)$
(where $K_2 = \chi^2_{k-1}(\alpha)$) will be adequate over the usual range of values of $P$.
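To indicate the magnitudes involved, the two heuristic approximations can be tabulated against the value $\alpha^k$ that would hold if the $k$ statistics were independent. The sketch below merely evaluates the stated formulas; the function name is hypothetical, and no claim is made about the accuracy of the approximations themselves.

```python
import math


def simultaneous_coverage(k, alpha):
    """Return (exp{-k(1 - alpha)}, 1 - k(1 - alpha), alpha**k) for given k and alpha.

    The first two are the heuristic approximations to P(k, K_2); the third is the
    value under (incorrectly) assumed independence, shown for comparison only.
    """
    exponential = math.exp(-k * (1.0 - alpha))
    linear = 1.0 - k * (1.0 - alpha)
    independent = alpha ** k
    return exponential, linear, independent
```

For $k = 4$ and $\alpha = 0.95$ the three values are about 0.819, 0.800 and 0.815.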







6.4. Other cases. The fact that we can thus derive from the statistic $\psi_1(\gamma)$ confidence
statements involving all the l.v.'s of $\Xi$ simultaneously, makes discussion of the regions
obtained from the statistics $\psi_p(\Gamma_p)$ for $1 < p < k$ somewhat superfluous. The regions in the
cases $1 < p < k$ are in any event not easily visualized; in the case $p = k$ each of the three
forms for $\psi_k(\Gamma_k)$ given in (4-4) is useful in appreciating some aspects of the resulting region.

The joint confidence region for $(\gamma; \lambda)$ derived from the statistic $\psi_1(\gamma; \lambda)$ of (4-3) can be
described as follows. The vector $\gamma$ lies in a region of the same shape as $R_1$ (6-2); for each such
$\gamma$, the corresponding $\lambda$ lies in an interval centered on $\hat\lambda = \gamma^T X \gamma$, with length
    $2d = 2\{(2v + c)\,\sigma^2\,(K_3 - \psi_1(\gamma))\}^{1/2}.$    (6-5)
The joint confidence coefficient will be $\alpha$ if $K_3$ is $\chi_k^2(\alpha)$. Again, we omit discussion of the cases
$1 < p < k$.
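The interval (6-5) follows from (4-3) by requiring $\psi_1(\gamma) + \psi_1(\lambda | \gamma) \le K_3$ and solving for $\lambda$. A minimal sketch, with a hypothetical function name, and with the bound $K_3$ supplied by the user rather than computed from $\chi^2$ tables:

```python
import math


def root_interval(center, psi_vector, K3, v, c, sigma2):
    """Interval for lambda at a given gamma, from (6-5).

    center     : gamma' X gamma
    psi_vector : psi_1(gamma) of (4-3)
    Returns (lower, upper), or None if gamma itself lies outside the region,
    i.e. if psi_1(gamma) already exceeds K3.
    """
    slack = K3 - psi_vector
    if slack < 0:
        return None
    d = math.sqrt((2 * v + c) * sigma2 * slack)
    return center - d, center + d
```

The interval is widest at a latent vector of $X$, where $\psi_1(\gamma) = 0$, and shrinks to a point on the boundary $\psi_1(\gamma) = K_3$.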

6.5. Further discussion. We now consider the possibility of constructing a system of
confidence or Bayes regions which will not suffer from the defect mentioned above with
respect to the system $R_1$; we search for regions which for all $X$ will leave part of the
hypersphere uncovered, and which preferably will always consist of $k$ separate cap-pairs. This
discussion is somewhat academic, since the region $R_1$ only fails to give $k$ distinct cap-pairs
when the latent roots of $X$ are so close together (relative to $v\sigma^2$) that the corresponding
l.v.'s are 'almost' indeterminate (for practical purposes). In this situation, the merging of
several caps into a connected region can be regarded as a desirable property of the system
$R_1$, since it indicates that several l.r.'s of $\Xi$ may be close together (or even equal).

When $\sigma^2$ is known, the log-likelihood (4-1) is fully determined by $S(\Xi)$; but $\Xi$ depends not
only on $\gamma$ but also on the other l.v.'s and on the l.r.'s, which are nuisance parameters. Thus
we cannot apply an inverse probability argument to $\gamma$ directly. Eliminating the nuisance
parameters by minimizing $S(\Xi)$ with respect to them, we have
    $\min S(\Xi) = \psi_1(\gamma),$
so that the contours $\psi_1(\gamma)$ = const. are contours of constant maximized likelihood. $\psi_1(\gamma)$ is
a minimum (zero) when $\gamma$ is any l.v. of $X$; $\psi_1(\gamma)$ is a maximum ($(l_1 - l_k)^2/4v\sigma^2 = m$, say) when
$\gamma = (g_1 \pm g_k)/\sqrt{2}$.

In the confidence region argument used above, the bound $K_1$ in (6-1) depends on $k$ and $\alpha$
only, and so is unrelated to the range of variation $(0, m)$ of $\psi_1(\gamma)$. A procedure taking
account of the value of $m$ (for example, one using the region (6-2) with $K_2 = K^*m$, $K^*$ being
a function of $v\sigma^2$, $k$, $\alpha$) can be given exact confidence properties only if the distribution of
$m$ is known, and this depends on the l.r.'s of $\Xi$. The same holds if, instead of $m$, some other function
of the l.r.'s of $X$ is used. The author has been unable to construct a system of exact confidence
regions with the desired properties. (This is not to say that such a system cannot exist, or
that a very good approximate system cannot be found.)


6.6. Wishart case. The confidence region $R_\gamma$ obtained by setting a bound $K_\alpha$ on the
statistic in (4-7) (with p = 1) is not readily described exactly; for $K_\alpha$ not too large, it is
composed of k cap-pairs on the unit hypersphere, of which a typical one is (to a first
approximation)

$$R_{\gamma j} \simeq \Big\{\gamma : n\sum_{i(\ne j)} h_i^2 (l_i - l_j)^2 / l_i l_j \le K_\alpha\Big\} \quad (\gamma = Gh).$$

For approximate confidence $\alpha$, $K_\alpha$ can be taken to be

$$K_\alpha = \{1 + (4n)^{-1}(3k + 4)\}\,\chi^2_{k-1}(\alpha).$$


























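Thus the boundary of $R_{\gamma j}$ is approximately elliptical, with semi-axes

$$s_{ij} = (K_\alpha l_i l_j / n)^{1/2}\,|l_i - l_j|^{-1} \quad (i \ne j).$$

Similarly, the statistic (4-8) (with p = 1) gives rise to a joint confidence region for $(\gamma; \lambda)$.
$\gamma$ lies in a region of the same shape as $R_\gamma$ above; and for each such $\gamma$, $\lambda$ lies in an interval
given by

$$n\Big\{\frac{\gamma^{T}X\gamma}{\lambda} - 1 - \ln\frac{\gamma^{T}X\gamma}{\lambda}\Big\} \le K_\alpha - n\ln\{(\gamma^{T}X\gamma)(\gamma^{T}X^{-1}\gamma)\}.$$

To the same degree of approximation as above, when $\gamma$ is in the jth cap $R_{\gamma j}$, this interval is
given by

$$n(\lambda - l_j)^2 \le 2l_j^2\Big(K_\alpha - n\sum_{i(\ne j)} h_i^2 (l_i - l_j)^2 / l_i l_j\Big).$$

For approximate confidence $\alpha$, $K_\alpha$ can be taken to be $\chi^2_k(\alpha)$.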


6.7. Rounding-off case. Here we shall obtain approximate confidence regions by setting 
bounds on the statistics in (4-3), which were derived originally for the normal invariant 
case; the values of the bounds will be obtained from the approximate distributions of these 
statistics in the rounding-off case, as characterized by the moments given in Table 3. Thus 
the confidence region will consist of k cap-pairs on the unit hypersphere; but in general the 
relative sizes of the caps will not be the same as in the normal case. 

Let the l.v.'s of $\Xi$ be $\gamma_1, \dots, \gamma_k$. Then the means and variances of the statistics $\psi_1(\gamma_j)$
$(j = 1, \dots, k)$, namely $(M_j, V_j)$ $(j = 1, \dots, k)$, are given in Table 3 (with Z, here replaced by
Xj = Lj (j = 1,...,k)). We take $\psi_1(\gamma_j)$ to be distributed approximately as $\theta_j\chi^2_{f_j}$, where


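$M_j = \theta_j f_j$, $V_j = 2\theta_j^2 f_j$. In practice, we shall approximate these quantities by replacing $\gamma_j$ by
the corresponding l.v. $g_j$ of X. Thus finally, the part of the desired approximate confidence
region containing $g_j$ is

$$R_{\gamma j} = \Big\{\gamma : \sum_{i(\ne j)} h_i^2 (l_i - l_j)^2 \le K_j\Big\} \quad (\gamma = Gh), \qquad (6\text{-}6)$$

where for confidence $\alpha$, we take $K_j$ as $\theta_j\sigma^2\chi^2_{f_j}(\alpha)$. The discussion in section 6.3 applies here.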


7. FURTHER APPROXIMATIONS 


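Lawley (1956) gave expansions in statistical differentials for the l.r.'s of a random symmetric
matrix. The method extends trivially to give expressions for the l.v.'s also. We write
as above

$$X = \Xi + Z, \quad \Xi\Gamma = \Gamma\Lambda, \quad XG = GL,$$
$$\Gamma^{T}X\Gamma = \Lambda + W, \quad W = \Gamma'Z\Gamma, \quad H = G'\Gamma,$$

where $w_{ij}$ is $O(\epsilon)$, $h_{ii}$ is $1 + O(\epsilon^2)$, $h_{ij}$ is $O(\epsilon)$ for $i \ne j$. Then from the diagonal and off-diagonal
elements of the equation

$$H\Lambda + HW = LH$$

we obtain

$$h_{ii}(l_i - \lambda_i - w_{ii}) = \sum_{a \ne i} h_{ia}w_{ai},$$
$$h_{ij}(l_i - \lambda_j) - h_{ii}w_{ij} = \sum_{a \ne i} h_{ia}w_{aj} \quad (i \ne j); \qquad (7\text{-}1)$$

whence

$$l_i = \lambda_i + w_{ii} + \sum_{a \ne i}\frac{w_{ia}^2}{\lambda_i - \lambda_a} + O(\epsilon^3),$$
$$(\lambda_i - \lambda_j)h_{ij} = w_{ij} + \sum_{a \ne i}\frac{w_{ia}w_{aj}}{\lambda_i - \lambda_a} - \frac{w_{ii}w_{ij}}{\lambda_i - \lambda_j} + O(\epsilon^3) \quad (i \ne j), \qquad (7\text{-}2)$$
$$h_{ii} = 1 - \tfrac12\sum_{a \ne i}\frac{w_{ia}^2}{(\lambda_i - \lambda_a)^2} + O(\epsilon^3),$$

and by successive substitution in (7-1) the expansions can be taken as far as is desired.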
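The second-order expansion for a latent root can be checked numerically in the smallest case, k = 2, where the exact roots are available in closed form. The sketch below is an illustration only: the function names and the numerical values of $\lambda_i$ and $w_{ij}$ are hypothetical, chosen so that the perturbation is small compared with the gap between the roots.

```python
import math

def exact_roots_2x2(a, b, d):
    """Latent roots (largest first) of the symmetric matrix [[a, b], [b, d]]."""
    mean = 0.5 * (a + d)
    half_gap = math.hypot(0.5 * (a - d), b)
    return mean + half_gap, mean - half_gap

def second_order_root(lam_i, w_ii, others):
    """Expansion for l_i truncated after the second-order term.

    others: iterable of (lam_a, w_ia) pairs for a != i.
    """
    return lam_i + w_ii + sum(w * w / (lam_i - lam_a) for lam_a, w in others)

# Hypothetical true roots and a small symmetric perturbation W.
lam1, lam2 = 3.0, 1.0
w11, w12, w22 = 0.02, 0.03, -0.01

l1, l2 = exact_roots_2x2(lam1 + w11, w12, lam2 + w22)
a1 = second_order_root(lam1, w11, [(lam2, w12)])
a2 = second_order_root(lam2, w22, [(lam1, w12)])
print(l1 - a1, l2 - a2)   # both differences are of third order in the perturbation
```

With these values the discrepancies are a few parts in a million, which is the size expected from the neglected third-order terms.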




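These expansions will now be used to obtain a closer approximation to the confidence
coefficient associated with the use of the region given (exactly) by the right-hand side of
(6-4), in the normal invariant case. Then W has the same structure as has Z itself.
Substituting from (7-2), we have

$$\sum_{i \ne j} h_{ij}^2 (l_i - l_j)^2 = \sum_{i \ne j} w_{ij}^2 + (l_j - \lambda_j - w_{jj})^2
 = \sum_{i \ne j} w_{ij}^2 + \Big(\sum_{i \ne j}\frac{w_{ij}^2}{\lambda_j - \lambda_i}\Big)^2 + O(\epsilon^5).$$

Hence to this order,

$$P\{R_{\gamma j} \text{ contains } \gamma_j\} = P\{x_j + v\sigma^2 y_j \le K_\alpha\} = P_j \text{ say},$$

where

$$K_\alpha = \chi^2_{k-1}(\alpha), \quad x_j = \sum_{i \ne j} u_{ij}^2, \quad y_j = \Big(\sum_{i \ne j} c_{ij}u_{ij}^2\Big)^2, \quad c_{ij}^{-1} = \lambda_i - \lambda_j, \quad u_{ij} = (v\sigma^2)^{-1/2}w_{ij},$$

so that the k-1 variables $\{u_{ij}\}$ $(i \ne j)$ are independent unit normal. Following the argument
of Beale (1960) (pp. 65-66), and writing $p_\nu(K)$ for the probability density function of $\chi^2_\nu$
evaluated at K, we have to the same order as before

$$P_j = \alpha - p_{k-1}(K_\alpha)\,E(v\sigma^2 y_j \mid x_j = K_\alpha)
 = \alpha - p_{k-1}(K_\alpha)\,\frac{v\sigma^2 K_\alpha^2}{k^2 - 1}\,C_j,$$

where

$$C_j = \Big(\sum_{i \ne j} c_{ij}\Big)^2 + 2\sum_{i \ne j} c_{ij}^2. \qquad (7\text{-}3)$$

Alternatively, to restore this probability (approximately) to the value $\alpha$, we must replace
$K_\alpha$ by

$$K^*_{\alpha j} = K_\alpha\{1 + K_\alpha v\sigma^2 C_j/(k^2 - 1)\}. \qquad (7\text{-}4)$$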


In practice, we replace the c's by their estimates in terms of the l.r.'s of X. Thus finally, an
approximate $100\alpha$ % confidence region for $\gamma$ consists of k cap-pairs on the unit hypersphere,
the jth cap being

$$R^*_{\gamma j} = \{\gamma : \gamma^{T}(X - l_j I)^2\gamma \le K^*_{\alpha j}v\sigma^2\}, \qquad (7\text{-}5)$$

where $K^*_{\alpha j}$ is given by (7-3) and (7-4), with

$$K_\alpha = \chi^2_{k-1}(\alpha), \quad c_{ij}^{-1} = l_i - l_j.$$

This cap is the intersection of the hypersphere with an elliptical hypercylinder, having
semi-axes

$$s_{ij} = (K^*_{\alpha j}v\sigma^2 c_{ij}^2)^{1/2} \quad (i \ne j).$$

The discussion in § 6.3 applies here.

If $\sigma^2$ is unknown but an independent estimate $s^2$ is available which is distributed as $\sigma^2\chi^2_f/f$,
a similar argument shows that the jth cap of an approximate $100\alpha$ % confidence region is
given by (7-5), where now $K^*_{\alpha j}v\sigma^2$ is replaced by $(k-1)vs^2F^*_j$, where

$$F^*_j = F\Big\{1 + \frac{(k-1)vs^2F}{k^2 - 1}\cdot\frac{f + k - 1}{f + (k-1)F}\,C_j\Big\} \qquad (7\text{-}6)$$

and $F = F_{k-1,f}(\alpha)$.













8. EXAMPLES 


The matrix

$$X = \begin{pmatrix} 1.6 & 1.6 & 1.2 \\ 1.6 & 0.3 & 0.6 \\ 1.2 & 0.6 & 0.7 \end{pmatrix}$$

was obtained by rounding off to one decimal a matrix the elements of which were approximately
normal with variances of the order of $(0.35)^2$ (actually the matrix (24-3) of T.R. 35).
We shall use this matrix to give examples of the application of the methods of this paper
both for the normal case and for the rounding-off case.

The latent roots and vectors of X are found to be given by

$$L = \begin{pmatrix} 3.3475 & 0 & 0 \\ 0 & 0.0592 & 0 \\ 0 & 0 & -0.8067 \end{pmatrix}, \quad
G = \begin{pmatrix} 0.7512 & 0.2688 & 0.6029 \\ 0.4830 & 0.3987 & -0.7796 \\ 0.4499 & -0.8768 & -0.1697 \end{pmatrix}.$$
8.1. Normal case. As explained in Mallows (1959), the matrix X (or rather the unrounded
form of it) is the matrix of second-order regression coefficients $(b_{ij})$ obtained by fitting a
quadratic response surface

$$f(y) = \beta_0 + \sum_{i=1}^{3}\beta_i y_i + \sum_{i,j=1}^{3}\beta_{ij}y_i y_j \qquad (8\text{-}1)$$

to data given by Box & Youle (1955). The experimental design used was central composite in
k = 3 dimensions, but was not second-order rotatable; if it had been, then the invariant
covariance structure (2-2) would have been appropriate. However, the departure from
rotatability does not seem serious in this case, and we shall proceed on the assumption that
it is negligible. Thus we shall assume (2-2) to hold, with

$$v = 0.02539, \quad c = 0.06197$$

(these quantities are characteristics of the design used). An estimate of $\sigma^2$, based on f = 5
d.f., is $s^2 = 4.848$.

Box & Youle reduced the fitted surface to canonical form; this gives

$$f(y) = b_0 + \sum_i b_i y_i + \sum_i l_i h_i^2,$$

where $h_i = g_i'y$, i.e. $h = G'y$ as before. The methods of this paper can be used to give
confidence regions for the latent vectors of the matrix $(\beta_{ij})$, i.e. for the directions of the true
canonical variables; and joint regions for the latent vectors and roots.

Using the statistic $\psi_1(\gamma)$ given in (4-3) with $\sigma^2$ replaced by its estimate $s^2$, we have for 95 %
confidence the region

$$\{\gamma : \gamma^{T}X^2\gamma - (\gamma^{T}X\gamma)^2 \le (k-1)vs^2F\}, \qquad (8\text{-}2)$$

where $F = F_{k-1,f}(\alpha) = 5.79$ when k = 3, f = 5, $\alpha = 0.95$.

An approximate region with exactly elliptical caps is given by (7-5) and (7-6); the cap $R^*_{\gamma 1}$
centred on the first l.v. $g_1$ of X (i.e. the first column of G) is the intersection of the unit
sphere with an elliptical cylinder with axis pointing along $g_1$, and with semi-axes

$$s_{1i} = \{(k-1)vs^2F^*_1\}^{1/2}\,|l_1 - l_i|^{-1} \quad (i = 2, 3), \qquad (8\text{-}3)$$
$$\{s_{1i}\} = 0.371,\ 0.294.$$

Here $F^*_1 = F(1 + 0.045) = 6.05$.










These semi-axes point along the other two l.v.'s of X. Similarly, the semi-axes of the other
two caps $R^*_{\gamma 2}$ and $R^*_{\gamma 3}$ are found to be

$$\{s_{2i}\} = 0.409,\ 1.55; \quad \{s_{3i}\} = 0.335,\ 1.605,$$

the corresponding $F^*$'s being

$$F^*_2 = 1.27F = 7.35, \quad F^*_3 = 1.36F = 7.85.$$
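The adjusted bounds and semi-axes quoted above can be reproduced from the latent roots of X and the design constants. The following is a minimal sketch, not the author's program: the function names are hypothetical, the roots are taken as printed rather than recomputed from X, and F = 5.79 is the tabulated 95 % point quoted in the text.

```python
import math

def cap_constant(roots, j):
    """C_j of (7-3), with each c_ij estimated as 1 / (l_i - l_j)."""
    c = [1.0 / (li - roots[j]) for i, li in enumerate(roots) if i != j]
    return sum(c) ** 2 + 2.0 * sum(ci * ci for ci in c)

def adjusted_F(roots, j, v, s2, F, f):
    """F*_j of (7-6)."""
    k = len(roots)
    base = (k - 1) * v * s2 * F
    correction = base * cap_constant(roots, j) / (k * k - 1)
    correction *= (f + k - 1) / (f + (k - 1) * F)
    return F * (1.0 + correction)

def semi_axes(roots, j, v, s2, F, f):
    """Semi-axes of the jth elliptical cap, as in (8-3)."""
    k = len(roots)
    radius = math.sqrt((k - 1) * v * s2 * adjusted_F(roots, j, v, s2, F, f))
    return [radius / abs(li - roots[j]) for i, li in enumerate(roots) if i != j]

roots = [3.3475, 0.0592, -0.8067]          # latent roots of X as printed
v, s2, F, f = 0.02539, 4.848, 5.79, 5      # design constant, estimate of sigma^2, F point, d.f.

for j in range(3):
    ratio = adjusted_F(roots, j, v, s2, F, f) / F
    axes = semi_axes(roots, j, v, s2, F, f)
    print(j + 1, round(ratio, 3), [round(a, 3) for a in axes])
```

Semi-axes greater than unity, as for the second and third caps, signal that the elliptical approximation has broken down, since no point of the unit sphere lies that far from the axis of the cylinder.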


8.2. Discussion. (i) Both of the semi-axes of $R^*_{\gamma 1}$ are fairly small; the extremes of these
axes correspond to angular deviations from $g_1$ of 22° and 17°, respectively. The neglected
term in (6-3) is at most 14 % of the term retained. Also the adjustment $F^*_1 - F$ is small.
We may therefore feel assured that this cap is a good approximation to a 95 % confidence
region for the l.v. of $\Xi$ associated with the largest l.r.

(ii) However $R^*_{\gamma 2}$ and $R^*_{\gamma 3}$ have each one semi-axis which exceeds unity; both of these lie
in the same plane (orthogonal to $g_1$). Obviously the approximation involved in deriving (7-6)
is very suspect in these circumstances.

(iii) Investigating the exact 95 % confidence region (8-2), we find that the cap containing
$g_1$ is approximately elliptical; the semi-diameters in the directions of $g_2$ and $g_3$ (for
comparison with the semi-axes $\{s_{1i}\}$ in (8-3)) are

0.395, 0.301.

The rest of the region consists of a band round the sphere, of varying width.

(iv) In this problem it appeared likely a priori that two of the latent roots of $\Xi$ are zero.
Replacing $l_2$ and $l_3$ by zero in (8-3), we obtain as an approximate 95 % confidence region
for the latent vector associated with the remaining latent root, a circular cap on the unit
sphere, centred on $g_1$, with radius

$$\{(k-1)vs^2F^*\}^{1/2}\,|l_1|^{-1} = 0.366.$$

This value differs slightly from that given in T.R. 35, since there the unrounded version of
X was used, and the correction $F^* - F$ was not applied.

(v) A joint confidence region for $(\gamma; \lambda)$ can be obtained from the statistic $\psi_2(\gamma; \lambda)$ in (4-3).
With 95 % confidence, $\gamma$ lies in a region of the same shape as (8-2), but with now

$$F = F_{k,f}(\alpha) = 5.41 \quad \text{with } k = 3,\ f = 5,\ \alpha = 0.95.$$

For each such $\gamma$, the corresponding latent root $\lambda$ lies in an interval centred on $\lambda = \gamma^{T}X\gamma$,
with length given by (6-5) (with $\sigma^2$ replaced by its estimate, and $K_\alpha = kF(\alpha)$). Thus for
$\gamma = g_1$, we have

$$\psi_1(g_1) = 0, \quad \lambda = l_1 = 3.3475, \quad d = 2.9785,$$

so that the interval for $\lambda$ is (0.369, 6.326). The length of the interval for $\lambda$ decreases to zero as
$\gamma$ approaches the boundary of its region.
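The interval for the latent root in (v) follows from the half-length formula (6-5) by direct arithmetic. A minimal sketch under the values quoted above (the function and argument names are hypothetical; the statistic is taken as zero at the centre of the cap):

```python
import math

def root_interval(centre, psi, v, c, sigma2, K):
    """Half-length d of (6-5) and the end-points of the interval for the latent root."""
    d = math.sqrt((2.0 * v + c) * sigma2 * (K - psi))
    return d, centre - d, centre + d

k, F = 3, 5.41                          # F point for (k, f) = (3, 5) d.f. at 95 %, as quoted
v, c, s2 = 0.02539, 0.06197, 4.848      # design constants and the estimate of sigma^2
d, lower, upper = root_interval(centre=3.3475, psi=0.0, v=v, c=c, sigma2=s2, K=k * F)
print(round(d, 4), round(lower, 3), round(upper, 3))
```

As the statistic approaches the bound K the square root shrinks to zero, which is the narrowing of the interval towards the boundary of the region described in the text.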


8.3. Rounding-off case. For the purposes of an example, we shall now disregard the
underlying normal variation in the elements of X, and shall consider only the effect of the
rounding-off errors. We assume these to be independent random variables, uniformly
distributed in $(-0.05, +0.05)$. Thus the rounding-off structure (2-7) is appropriate, with


$$\sigma^2 = \tfrac13(0.05)^2 = 0.000833, \quad \kappa_4 = -1.2.$$













The construction of approximate confidence regions for the latent vectors is described in 
§ 6.7; they consist of three cap-pairs; each of these being centred on a l.v. of X and having 
the same shape as in the normal case above, but with different relative sizes. 

Taking first the cap containing the l.v. $g_1$, we find

$$\sum_i g_{i1}^4 = 0.4138, \quad \sum_i g_{i1}^6 = 0.2007, \quad \sum_i g_{i1}^8 = 0.1060,$$

whence $\theta_1 = 0.5673$, $f_1 = 2.4922$.
Thus for 95 % confidence, we take in (6-6)

$$K_1\sigma^{-2} = \theta_1\chi^2_{f_1}(0.95) = 3.925$$

and the resulting cap is elliptical with semi-axes

$$\{s_{1i}\} = \{K_1^{1/2}|l_1 - l_i|^{-1}\} = 0.0174,\ 0.0138.$$

Similarly for the 95 % caps containing $g_2$ and $g_3$, we find

$$\theta_2 = 0.5766, \quad f_2 = 2.8121, \quad \{s_{2i}\} = 0.0182,\ 0.0693;$$
$$\theta_3 = 0.5835, \quad f_3 = 2.5746, \quad \{s_{3i}\} = 0.0141,\ 0.0677.$$

Here we have not applied the adjustment in (7-4); strictly it is not applicable to the present
non-normal case, but in any event the effect it is designed to counteract is here small
compared with the approximation involved in using $\theta\chi^2_f$ to represent the distribution of $\psi_1(\gamma)$.
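The two steps used in this example, fitting a scaled chi-squared variable by its first two moments and converting the bound into semi-axes, can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the latent roots are those printed in section 8, and the value 3.925 is taken from the text rather than recomputed, since the percentage point of chi-squared with fractional degrees of freedom is not evaluated here.

```python
import math

def match_scaled_chi2(mean, var):
    """Return (theta, f) such that theta * chi2_f has the given mean and variance."""
    theta = var / (2.0 * mean)
    f = 2.0 * mean * mean / var
    return theta, f

def cap_semi_axes(K, roots, j):
    """Semi-axes K^(1/2) / |l_j - l_i| of the jth cap for bound K."""
    return [math.sqrt(K) / abs(li - roots[j]) for i, li in enumerate(roots) if i != j]

# Round trip: moments implied by the printed theta_1, f_1 give the same pair back.
theta1, f1 = 0.5673, 2.4922
theta, f = match_scaled_chi2(theta1 * f1, 2.0 * theta1 ** 2 * f1)

sigma2 = 0.05 ** 2 / 3.0                 # variance of a uniform rounding error
roots = [3.3475, 0.0592, -0.8067]        # latent roots of X as printed
K1 = 3.925 * sigma2                      # bound for the first cap, from the quoted ratio
axes = cap_semi_axes(K1, roots, 0)
print(round(theta, 4), round(f, 4), [round(a, 4) for a in axes])
```

The semi-axes are roughly twenty times smaller than in the normal case because the rounding-off variance is so much smaller than the estimated error variance.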


REFERENCES 


ANDERSON, T. W. (1958). An introduction to multivariate statistical analysis. New York: John Wiley
& Sons.

BEALE, E. M. L. (1960). Confidence regions in non-linear estimation. J. R. Statist. Soc. B, 22, 41-76
(with discussion, pp. 76-88).

BOX, G. E. P. (1949). A general distribution theory for a class of likelihood ratio criteria. Biometrika,
36, 317-46.

BOX, G. E. P. & YOULE, P. V. (1955). The exploration and exploitation of response surfaces: An
example of the link between the fitted surface and the basic mechanism of the system. Biometrics,
11, 287-323.

LAWLEY, D. N. (1956). Tests of significance for the latent roots of covariance and correlation matrices.
Biometrika, 43, 128-36.

MALLOWS, C. L. (1959). Latent roots and vectors of random matrices. Technical Report, No. 35,
Statistical Techniques Research Group, Princeton University. (Referred to as T.R. 35.)

VAN DER VAART, H. R. (1958). Some results on the probability distribution of the latent roots of
a symmetric matrix of continuously distributed elements, and some applications to the theory
of response surface estimation. Technical Report No. 18, Institute of Statistics, University of
North Carolina.






















Biometrika (1961), 48, 1 and 2, p. 151
Printed in Great Britain 


Expected values of normal order statistics 


By H. LEON HARTER 


Aeronautical Research Laboratory 
Wright-Patterson Air Force Base 


1. HISTORY


The problem of order statistics has received a great deal of attention from statisticians
dating at least as far back as a paper by Karl Pearson (1902) giving a solution of a
generalization of a problem proposed by Galton (1902). The generalized problem is that of finding
the average difference between the pth and the (p + 1)th individuals in a sample of size n
when the sample is arranged in order of magnitude. The result is


$$\frac{n!}{(n-p)!\,p!}\int_{-\infty}^{\infty}\alpha^{\,n-p}(1-\alpha)^{p}\,dx, \qquad (1\text{-}1)$$


where $\alpha = \int_{-\infty}^{x}\phi(x)\,dx$ and $\phi(x)$ is the probability density function of the variable x. Pearson
stated a theorem, which he attributed to W. F. Sheppard, that the average differences
between successive individuals are the successive terms in the binomial expansion of

$$\int_{-\infty}^{\infty}\{\alpha + (1 - \alpha)\}^{n}\,dx. \qquad (1\text{-}2)$$


In a footnote, Pearson remarked, 'Clearly a knowledge of the average difference in character
of two adjacent individuals involves also a knowledge of the average difference in character
between any two individuals'. For a symmetric population, such knowledge also involves
a knowledge of the expected values of all the order statistics, since for odd sample sizes
n = 2k + 1, where k is an integer, $E(x_{k+1}) = \mu$ (the population mean), while for even sample
sizes n = 2k, $\tfrac12[E(x_k) + E(x_{k+1})] = \mu$.

Irwin (1925) gave expressions for the mean difference between the pth and qth individuals
in order of magnitude and for the moments of the frequency distribution of differences
between consecutive individuals. Tippett (1925) published a seven-decimal-place table of
the probability integral of the largest individual in samples of size n from a normal
population for n = 3, 5, 10 and x = -2.6(0.2)5.8; n = 20, 30, 50 and x = -0.1(0.1)6.0;
n = 100(100)1000 and x = 1.0(0.1)6.5.
The same paper included a five-decimal-place table of the mean range of samples of size
n = 2(1)100 from a normal population from which the expected values of the largest and
smallest individuals could of course be derived. The expected values of normal order
statistics other than the first and last were not computed until somewhat later.

Karl Pearson & Margaret V. Pearson (1931) obtained an expansion in Taylor series for
$E(x_i)$, accurate to 5 or 6 decimal places for $|E(x_i)|$ not too large (say < 1). Fisher & Yates
(1938, Table XX) published a two-decimal-place table of the expected values of all normal
order statistics for samples of size n = 2(1)50. Their values are correct except for four errors
of a unit in the last place, due to rounding. Hastings, Mosteller, Tukey & Winsor (1947)
published a five-decimal-place table of the means and standard deviations of all order
statistics for samples of size n = 2(1)10 from a normal population, also from a uniform
population and from a selected long-tailed population. Their values for the means of normal
order statistics are correct except for n = 10, where there are errors of from 1 to 7 units in
the last place.

Wilks (1948) published an expository paper summarizing work on order statistics up to 
that time and listing 90 references. 

Godwin (1949a) published a table of the expected values of rank differences in normal
samples, to 10 decimal places for n = 2; 9 decimal places for n = 3, 4; 8 decimal places for
n = 5; 7 decimal places for n = 6, 7; 6 decimal places for n = 8; and 5 decimal places for
n = 9, 10. Godwin (1949b) also published a seven-decimal-place table of the means and
standard deviations of all normal order statistics for samples of size n = 2(1)10. His
values for the means of the first-order statistics are accurate to 7 decimal places, and his
other values are probably equally accurate, since they were computed by the same method.
Cadwell (1953) published a table of moments (mean, variance, $\beta_1$ and $\beta_2$) and selected
percentage points of the first quasi-range for samples of size n = 10(1)30. His values of the
means are correct except for one error of a unit in the last place, due to rounding.
E. S. Pearson & Hartley (1954, Table 28) published a table of expected values of normal
order statistics, to 3 decimal places for n = 2(1)20 and to 2 decimal places for
n = 21(1)26(2)50;
values for n = 2(1)10 were compiled from Godwin's table, those for n = 11(1)20 were
freshly computed by Jean H. Thompson, while those for n > 20 were taken from the table
by Fisher and Yates. These values are correct except for three errors of a unit in the last
place, due to rounding. Harter (1959) published a six-decimal-place table (accurate to
within a unit in the last place) of the expected values of the range and of the first 8
quasi-ranges for samples of size n = 2(1)100 taken from a normal population. By dividing these
values by two, the expectations of the absolute values of the nine largest and the nine smallest
normal deviates can be obtained.

Federer (1951) used a somewhat different approach than did most of the aforementioned
authors, who depended largely on numerical integration for the determination of tabular
values. Federer made use of the recurrence formula

$$E(x_{m,i+1}) = \frac{1}{i}\{mE(x_{m-1,i}) - (m-i)E(x_{m,i})\}, \qquad (1\text{-}3)$$

where $x_{m,i}$ is the ith largest deviate from a sample of size m. Starting from Tippett's table
of expected values of the range, Federer computed three-decimal-place values of the three
largest normal deviates for samples of size n = 41(1)200 and two-decimal-place values
of the fourth largest normal deviate for n = 41(1)200 and of the fifth largest normal
deviate for n = 41(1)100. Because of loss of accuracy with repeated application of the
recurrence formula, some of Federer's values are in error by from 1 to 3 units in the last
place, and it is evident that the form of the recurrence formula given by (1-3) is of little
value in computation. The author is indebted to the Editor for pointing out that, if written
in the form

$$E(x_{m-1,i}) = \frac{1}{m}\{iE(x_{m,i+1}) + (m-i)E(x_{m,i})\}, \qquad (1\text{-}4)$$

the recurrence formula can be used for working downwards with no serious accumulation
of rounding errors. Similar recurrence formulae for the variance and covariance of order
statistics have recently been obtained by Govindarajulu (1959).
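Working downwards with the recurrence can be illustrated on a very small case. The sketch below is not taken from the paper: the function name is hypothetical, and the starting values for a sample of four are the standard five-decimal expected values 1.02938 and 0.29701 (with their negatives), supplied here as an assumption. The results are compared with the closed forms $3/(2\sqrt{\pi})$ and $1/\sqrt{\pi}$ for the largest of three and of two normal deviates.

```python
import math

def step_down(m, higher):
    """Expected values for sample size m - 1 from those for size m.

    higher[i - 1] holds the expected value of the ith largest of m deviates;
    each output value is a weighted mean of two neighbouring inputs.
    """
    return [(i * higher[i] + (m - i) * higher[i - 1]) / m for i in range(1, m)]

e4 = [1.02938, 0.29701, -0.29701, -1.02938]   # assumed five-decimal values for m = 4
e3 = step_down(4, e4)
e2 = step_down(3, e3)
print([round(x, 5) for x in e3], [round(x, 5) for x in e2])
print(3.0 / (2.0 * math.sqrt(math.pi)), 1.0 / math.sqrt(math.pi))
```

Because every new value is a weighted average with positive weights summing to one, rounding errors in the starting row cannot grow as the sample size is reduced, which is consistent with the remark that the downward form is free of serious error accumulation.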


2. METHOD OF COMPUTATION 


The expected value of the kth largest observation in a sample of size n from a standard
normal population (mean zero and variance one) is given by the equation

$$E(x_{n,k}) = \frac{n!}{(n-k)!\,(k-1)!}\int_{-\infty}^{\infty} x\,[\tfrac12 - \Phi(x)]^{k-1}[\tfrac12 + \Phi(x)]^{n-k}\phi(x)\,dx, \qquad (2\text{-}1)$$

where $\phi(x) = (2\pi)^{-1/2}e^{-x^2/2}$ and $\Phi(x) = \int_0^x \phi(x)\,dx$. The expected value of the kth smallest
observation is given by the same expression preceded by a minus sign, so that for a given
value of n it is necessary only to compute the expected values for $k = 1(1)[\tfrac12 n]$. This was
done by numerical integration on the Univac Scientific (ERA 1103A) computer, for
n = 2(1)100 and for values of n, none of whose prime factors exceeds seven, up through
n = 400. Values of $\log_{10} n!$ for n = 1(1)400 from a table by Pearson & Hartley (1954) and
values of $\phi(x) = \phi(-x)$ for x = 0(0.05)7.60, $2\Phi(x) = -2\Phi(-x)$ for x = 0(0.05)5.95, and
$1 - 2\Phi(x) = 1 + 2\Phi(-x)$ for x = 6.00(0.05)7.60 from tables by the National Bureau of
Standards (1953, Tables I and II) were read into the computer. For each pair of values of
n and k, the product I(n, k, x) of the multiplicative constant and the integrand was
determined for x = -7.60(0.05)7.60 by computing $e^{\log_e I(n,k,x)}$, where

$$\log_e I(n,k,x) = \log_e n! - \log_e (n-k)! - \log_e (k-1)! + \log_e x + (k-1)\log_e[\tfrac12 - \Phi(x)]$$
$$\qquad\qquad + (n-k)\log_e[\tfrac12 + \Phi(x)] + \log_e \phi(x). \qquad (2\text{-}2)$$

Fixed-point binary arithmetic was used, and the numbers were scaled so as to retain as
much accuracy as possible. Since I(n, k, x) is zero (to the number of places carried in the
computer) for all values of n and k when $|x| > 7.60$, the resulting value of $E(x_{n,k})$, obtained
by using either the trapezoidal rule or the seven-point Lagrangian integration formula,
is found by summing I(n, k, x) for x = -7.60(h)7.60 and multiplying by the interval, h.
Results were computed and printed out (to seven decimal places) for h = 0.05 and h = 0.10,
and agreement is sufficiently close to guarantee that the values of $E(x_{n,k})$ for h = 0.05 are
accurate to within a unit in the fifth decimal place. Accordingly, the values for h = 0.05
were rounded to five decimal places, and the five-decimal-place values were punched on
cards and printed on the IBM 407 tabulator. The results for n = 2(1)100(25)250(50)400
are shown in Table 1.
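The summation described above is easy to imitate in floating-point arithmetic. The following is a minimal sketch under stated assumptions, not a reconstruction of the original fixed-point program: the function names are hypothetical, the tabulated functions are replaced by the library functions `math.erfc` and `math.lgamma`, and the factor x is applied outside the logarithm so that negative abscissae need no special treatment.

```python
import math

def log_weight(n, k, x):
    """Logarithm of the integrand of (2-1) without the factor x."""
    upper = 0.5 * math.erfc(x / math.sqrt(2.0))     # probability of exceeding x
    lower = 0.5 * math.erfc(-x / math.sqrt(2.0))    # probability of falling below x
    log_phi = -0.5 * x * x - 0.5 * math.log(2.0 * math.pi)
    return (math.lgamma(n + 1) - math.lgamma(n - k + 1) - math.lgamma(k)
            + (k - 1) * math.log(upper) + (n - k) * math.log(lower) + log_phi)

def expected_kth_largest(n, k, h=0.05, limit=7.60):
    """Trapezoidal estimate of the expected kth largest of n standard normal deviates."""
    steps = round(2.0 * limit / h)
    total = 0.0
    for s in range(steps + 1):
        x = -limit + s * h
        weight = 0.5 if s in (0, steps) else 1.0     # end-point weights of the trapezoidal rule
        total += weight * x * math.exp(log_weight(n, k, x))
    return h * total

print(round(expected_kth_largest(2, 1), 5))
print(round(expected_kth_largest(10, 1), 5))
print(round(expected_kth_largest(10, 1, h=0.10), 5))
```

Comparing the results for the two step lengths, as in the last two lines, mirrors the check used in the text to judge the accuracy of the tabulated values.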

Acknowledgment, with thanks, is made to Eugene H. Guthrie, who programmed the 
problem for computation on the ERA 1103 A. 


3. BLOM'S APPROXIMATION
In 1954 Blom became interested in the problem of plotting points on normal probability
paper and, after reading a paper by Chernoff & Lieberman (1954), in the related problem of
estimating parameters by means of linear functions of order statistics. Blom (1958) proposed
approximating the ith normal order statistic (ith smallest normal deviate) for a sample of
size n by means of the relation

$$E(x_i) = \Phi^{-1}\Big(\frac{i - \alpha}{n - 2\alpha + 1}\Big), \qquad (3\text{-}1)$$

where $\Phi(x) = \int_{-\infty}^{x}\phi(x)\,dx$, with $\phi(x) = (2\pi)^{-1/2}e^{-x^2/2}$. Note that $\Phi(x)$ is defined differently
here than in § 2. It should be mentioned that there has been a long-standing argument
between advocates of the approximations corresponding to $\alpha = 0$ and $\alpha = 0.5$, neither of
which is correct. Blom tabulated the value of $\alpha$ required to yield the correct value of $E(x_i)$
for $i = 1(1)[\tfrac12 n]$ when n = 2(2)10(5)20. The values of $\alpha$ increase as n increases, the lowest
value being 0.330 for n = 2, i = 1. For a given n, $\alpha$ is least for i = 1, rises quickly to a peak
for a relatively small value of i, and then drops off slowly; as an example, for n = 20,
$\alpha = 0.374$ for i = 1, the peak value of $\alpha$ is 0.391 for i = 3, and $\alpha$ drops to 0.386 for i = 8, 9, 10.
Blom conjectured that $\alpha$ always lies in the interval (0.33, 0.50). He suggested the use of
$\alpha = \tfrac38$ as a compromise value.

Table 2. Values of $\alpha_{i,n}$ such that $E(x_i) = \Phi^{-1}[(i - \alpha_{i,n})/(n - 2\alpha_{i,n} + 1)]$

  i   n=25   n=50   n=100  n=200  n=400       i   n=100  n=200  n=400
  1   0.377  0.384  0.391  0.396  0.401      30   0.394  0.404  0.414
  2   0.394  0.403  0.412  0.419  0.426      35   0.393  0.402  0.412
  3   0.395  0.405  0.415  0.423  0.430      40   0.392  0.400  0.410
  4   0.394  0.405  0.415  0.424  0.431      45   0.391  0.398  0.408
  5   0.392  0.403  0.414  0.423  0.431      50   0.391  0.397  0.407
  6   0.391  0.402  0.412  0.422  0.430      55     -    0.396  0.405
  7   0.390  0.400  0.411  0.421  0.429      60     -    0.395  0.404
  8   0.389  0.399  0.410  0.420  0.429      65     -    0.394  0.403
  9   0.388  0.398  0.408  0.418  0.428      70     -    0.394  0.402
 10   0.388  0.397  0.407  0.417  0.427      75     -    0.393  0.401
 11   0.387  0.396  0.406  0.416  0.426      80     -    0.393  0.400
 12   0.387  0.395  0.405  0.415  0.425      85     -    0.392  0.399
 13     -    0.394  0.404  0.414  0.424      90     -    0.392  0.399
 14     -    0.393  0.403  0.414  0.423      95     -    0.391  0.398
 15     -    0.393  0.402  0.413  0.423     100     -    0.391  0.398
 16     -    0.392  0.402  0.412  0.422     110     -      -    0.396
 17     -    0.392  0.401  0.411  0.421     120     -      -    0.396
 18     -    0.391  0.400  0.410  0.420     130     -      -    0.395
 19     -    0.391  0.399  0.410  0.420     140     -      -    0.394
 20     -    0.391  0.399  0.409  0.419     150     -      -    0.394
 21     -    0.390  0.398  0.408  0.419     160     -      -    0.393
 22     -    0.390  0.398  0.408  0.418     170     -      -    0.393
 23     -    0.390  0.397  0.407  0.417     180     -      -    0.392
 24     -    0.390  0.397  0.407  0.417     190     -      -    0.392
 25     -    0.390  0.396  0.406  0.416     200     -      -    0.391
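Blom's relation, and the value of the constant needed to reproduce a known expected value exactly, can both be evaluated in a few lines. This is a minimal sketch, not part of the original paper: the function names are hypothetical, the inverse normal integral is supplied by `statistics.NormalDist` from the Python standard library, and the exact expected value $-1/\sqrt{\pi}$ for the smaller of two normal deviates is used as the test case.

```python
import math
from statistics import NormalDist

_STD = NormalDist()   # standard normal distribution

def blom(i, n, alpha=0.375):
    """Approximate expected value of the ith smallest of n standard normal deviates."""
    return _STD.inv_cdf((i - alpha) / (n - 2.0 * alpha + 1.0))

def alpha_required(i, n, expected):
    """Constant for which the approximation reproduces a known expected value."""
    p = _STD.cdf(expected)
    return (i - (n + 1) * p) / (1.0 - 2.0 * p)

print(round(blom(1, 10), 4), round(blom(1, 10, alpha=0.364), 4))
print(round(alpha_required(1, 2, -1.0 / math.sqrt(math.pi)), 3))
```

The second line recovers the value 0.330 quoted for n = 2, i = 1. The relation is symmetric, so the approximation for the ith largest deviate is the negative of that for the ith smallest.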

If one solves (3-1) for the value of $\alpha$ required to yield the correct value of $E(x_i)$ for given
i and n, one obtains

$$\alpha_{i,n} = \frac{i - (n+1)\,\Phi[E(x_i)]}{1 - 2\Phi[E(x_i)]}. \qquad (3\text{-}2)$$

Values of $\alpha_{i,n}$ for $i = 1(1)[\tfrac12 n]$ when n = 25, 50, 100, 200, 400 have been computed on the
Burroughs E 101-3 computer, and the results, rounded to three decimal places, are shown
in Table 2. For brevity, results have been given only for values of i which are multiples of 5
for i between 25 and 100 and multiples of 10 for i between 100 and 200. A glance at the values
in Table 2 is sufficient to show that the compromise value of $\tfrac38$ or 0.375 for $\alpha$ proposed by Blom
is too low except for small values of n. If, however, one wishes to minimize the maximum
error in estimating $E(x_i)$ for $n \le 400$, one is led to choose a value of $\alpha$ even smaller than $\tfrac38$,
since the estimate of $E(x_i)$ is much more sensitive to changes in $\alpha$ for small values of n (and i)
than for large values. The maximum error in estimating $E(x_i)$ is minimized by choosing
$\alpha = 0.363$. This gives a maximum error of 0.018, which is hardly satisfactory. It is possible,
however, to do a fairly good job of estimating $E(x_i)$ by choosing one or two compromise
values of $\alpha$ for each n. One can choose a single compromise value, $\alpha_n$, for each n, to be used for
all values of i, and simultaneously ensure that the error in $E(x_i)$ does not exceed four units in
the third decimal place. If one uses $\alpha_{1,n}$ to estimate $E(x_1)$ and $\alpha_{2,n}$ to estimate $E(x_i)$ for $i \ne 1$,
the error in $E(x_i)$ will not exceed one unit in the third decimal place. Values of $\alpha_n$, $\alpha_{1,n}$ and $\alpha_{2,n}$
are given in Table 3 for n = 2(2)10(5)25, 50, 100, 200, 400 along with regression equations


Table 3. Compromise values of $\alpha$

   n    $\alpha_n$    $\alpha_{1,n}$   $\alpha_{2,n}$
   2   0.330   0.330     -
   4   0.349   0.347   0.359
   6   0.359   0.355   0.368
   8   0.364   0.360   0.374
  10   0.368   0.364   0.378
  15    ...     ...     ...
  20    ...     ...     ...
  25    ...     ...     ...
  50   0.389   0.384   0.408
 100   0.396   0.391   0.412
 200   0.402   0.396   0.419
 400   0.407   0.401   0.426

To estimate $\alpha$ for intermediate values of n, use the following equations:

$\alpha_n = 0.314195 + 0.063336X - 0.010895X^2$,
$\alpha_{1,n} = 0.315065 + 0.057974X - 0.009776X^2$,
$\alpha_{2,n} = 0.327511 + 0.058212X - 0.007909X^2$,

where $X = \log_{10} n$.


to be used for intermediate values of n. Values of $\alpha$ found by substituting tabular values of
n in these regression equations do not differ from the corresponding tabular values of $\alpha$
by more than two units in the third decimal place, and this error in $\alpha$ does not increase the
error in $E(x_i)$ by more than one unit in the third decimal place. There is reason to believe
that results for intermediate values of n will be equally good, but use of these equations for
n > 400 is emphatically discouraged. Thus, if one wishes to interpolate for intermediate
values of n, the maximum errors are five units in the third decimal place for the
approximation based on a single compromise value of $\alpha$ and two units in the third decimal place for
the approximation based on two compromise values of $\alpha$. These errors compare with a
maximum error of between one and two units in the third decimal place for linear
interpolation between successive values of n for a given i (k) in Table 1. Comparison of the
maximum errors might lead to the conclusion that interpolation in Table 1 is always more accurate
than interpolation using Blom's approximation. This would be erroneous, since the
maximum error for the former occurs for large values of i (near $\tfrac12 n$), while the maximum error
for the latter occurs for small values of i. Interpolation using Blom's approximation for
large values of i, especially when the desired value of n lies about midway between widely
separated successive tabular values of n (for example, when n = 232), and interpolation in
Table 1 otherwise will limit the error to no more than a unit in the third decimal place.
If more accurate values are required, they should be computed in the same way that
Table 1 was computed, as should values for n > 400, or else they should be computed by
working downwards from the next higher tabular value of n, using the recurrence formula
(1-4). Table 4 summarizes the above results, giving maximum errors in determining $E(x_i)$
by various methods.


Table 4. Maximum errors in determining $E(x_i)$ by various methods

                                          Values of n    Intermediate
Method                                    in Table 3     values of n

Blom's approximation:
  $\alpha = 0.363$ for all values of n          0.018          0.018
  One value of $\alpha$ for each n              0.004          0.005
  Two values of $\alpha$ for each n             0.001          0.002
Interpolation in Table 1                      -            0.002
Recurrence formula (1-4)                    0.00001        0.00001
Numerical integration (h = 0.05)           <0.00001       <0.00001
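The regression equations given with Table 3 can be evaluated for any n between the tabulated values. A minimal sketch follows; the function name and the labels "single", "first" and "others" are hypothetical, and the refusal to evaluate outside the tabulated range of n reflects the warning in the text against using the equations beyond n = 400.

```python
import math

# Coefficients (constant, linear, quadratic) in X = log10(n), as printed with Table 3.
COEFFICIENTS = {
    "single": (0.314195, 0.063336, -0.010895),   # one compromise value for all i
    "first":  (0.315065, 0.057974, -0.009776),   # for the extreme order statistic
    "others": (0.327511, 0.058212, -0.007909),   # for all remaining order statistics
}

def compromise_alpha(n, which="single"):
    """Compromise constant for sample size n from the regression on log10(n)."""
    if not 2 <= n <= 400:
        raise ValueError("regression equations are intended for 2 <= n <= 400")
    a, b, c = COEFFICIENTS[which]
    x = math.log10(n)
    return a + b * x + c * x * x

for label in ("single", "first", "others"):
    print(label, round(compromise_alpha(232, label), 3))
```

At the tabulated sample sizes the fitted constants agree with the tabular ones to within about two units in the third decimal place, as stated in the text.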


4. APPLICATIONS 


Pearson & Hartley (1954, p. 56) have given two examples of applications of tables of 
expected values of normal order statistics. The first of these is concerned with estimating 
the weight of the five heaviest of 30 lambs at age 2½ months, given the mean and standard
deviation of the population, which is assumed to be normal. The second deals with the use 
of order statistics in estimating the population standard deviation. Pearson & Hartley 
and also Fisher & Yates (1953, p. 76) mention the use of expected values of normal order 
statistics in the analysis of variance of ranked data. The potential use of expected values of 
normal order statistics for transformation to standard normal scores preliminary to the 
analysis of variance was the principal motivation for the present study. In cases where only 
the rank of the observations is known, there is no reasonable alternative to transformation 
to standard normal scores, but the usefulness of this method is not restricted to such cases. 
When the data are known to have come from a population which does not satisfy the assump- 
tions underlying the analysis of variance, of which normality is one, or when the data them- 
selves give a strong indication to that effect, the experimenter seeks a transformation which 
will minimize or eliminate departures from the assumptions. One transformation which 
should be considered is the transformation to standard normal scores, and a preliminary 
investigation has shown that this transformation has some very desirable properties; in 
some cases it reduces both non-additivity and non-homogeneity of variance to lower levels 
than does any transformation of the form $(x + c)^p$. It has, of course, the obvious
disadvantage of not being reversible.
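The transformation to standard normal scores replaces each observation by the expected value of the normal order statistic of the same rank. The sketch below is an illustration under stated assumptions: the function name is hypothetical, Blom's approximation with the constant 3/8 stands in for the exact tabulated expected values, and tied observations are not given any special treatment.

```python
from statistics import NormalDist

def normal_scores(values, alpha=0.375):
    """Replace each observation by an approximate expected normal order statistic."""
    n = len(values)
    inverse = NormalDist().inv_cdf
    order = sorted(range(n), key=lambda idx: values[idx])   # indices in increasing order
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = inverse((rank - alpha) / (n - 2.0 * alpha + 1.0))
    return scores

data = [3.1, 0.2, 5.7, 4.4, 1.9]
print([round(s, 3) for s in normal_scores(data)])
```

Only the ranks of the data enter the result, which is why the original observations cannot be recovered from the scores.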


REFERENCES 


BLOM, GUNNAR (1958). Statistical Estimates and Transformed Beta-Variables. New York: John Wiley
and Sons, Inc.

CADWELL, J. H. (1953). The distribution of quasi-ranges in samples from a normal population. Ann.
Math. Statist. 24, 603-13.

CHERNOFF, HERMAN & LIEBERMAN, GERALD J. (1954). Use of normal probability paper. J. Amer.
Statist. Ass. 49, 778-85.

FEDERER, WALTER T. (1951). Evaluation of Variance Components from a Group of Experiments with
Multiple Classifications. Iowa Agricultural Experiment Station Research Bulletin No. 380.

FISHER, RONALD A. & YATES, FRANK (1953). Statistical Tables for Biological, Agricultural and Medical
Research, 4th edition (1st edition, 1938). Edinburgh: Oliver and Boyd.

GALTON, FRANCIS (1902). The most suitable proportion between the values of first and second prizes.
Biometrika, 1, 385-90.













GODWIN, H. J. (1949a). On the estimation of dispersion by linear systematic statistics. Biometrika,
36, 92-100.

GODWIN, H. J. (1949b). Some low moments of order statistics. Ann. Math. Statist. 20, 279-85.

GOVINDARAJULU, Z. (1959). On moments of order statistics from normal populations (abstract).
Ann. Math. Statist. 30, 617.

HARTER, H. LEON (1959). The use of sample quasi-ranges in estimating population standard deviation.
Ann. Math. Statist. 30, 980-99.

HASTINGS, CECIL, JR., MOSTELLER, FREDERICK, TUKEY, JOHN W. & WINSOR, CHARLES P. (1947).
Low moments for small samples: a comparative study of order statistics. Ann. Math. Statist. 18,
413-26.

IRWIN, J. O. (1925). The further theory of Francis Galton's individual difference problem. Biometrika,
17, 100-28.

NATIONAL BUREAU OF STANDARDS (1953). Tables of Normal Probability Functions. Applied
Mathematics Series No. 23.

PEARSON, E. S. & HARTLEY, H. O. (1954). Biometrika Tables for Statisticians, 1. Cambridge University
Press for the Biometrika Trustees.

PEARSON, KARL (1902). Note on Francis Galton's problem. Biometrika, 1, 390-9.

PEARSON, KARL & PEARSON, MARGARET V. (1931). On the mean character and variance of a ranked
individual, and on the mean and variance of the intervals between ranked individuals. Biometrika,
23, 364-97.

TIPPETT, L. H. C. (1925). On the extreme individuals and the range of samples taken from a normal
population. Biometrika, 17, 364-87.

WILKS, S. S. (1948). Order statistics. Bull. Amer. Math. Soc. 54, 6-50.


ADDENDUM 


An account of methods of computing expected values of normal order statistics would 
be incomplete without mention of the series expansions worked out by David & Johnson 
(1954) and by Plackett (1958). Saw (1960) has made a comparison of the David-Johnson
series and the Plackett series. Neither series seems particularly well adapted to the 
computation of tables of the sort included in this paper, though either would be quite 
useful in obtaining very accurate expected values for isolated cases. The author wishes 
to thank Dr F. N. David for drawing his attention to these papers. 


REFERENCES 


DAVID, F. N. & JOHNSON, N. L. (1954). Statistical treatment of censored data. Biometrika, 41, 228-40.

PLACKETT, R. L. (1958). Linear estimation from censored data. Ann. Math. Statist. 29, 131-42.

SAW, J. G. (1960). A note on the error after the number of terms of the David-Johnson series for
the expected values of normal order statistics. Biometrika, 47, 79-86.


Table 1. Expected values of normal order statistics

$$E(x_{k|n}) = \frac{n!}{(n-k)!\,(k-1)!} \int_{-\infty}^{\infty} x \left[\tfrac{1}{2} - \Phi(x)\right]^{k-1} \left[\tfrac{1}{2} + \Phi(x)\right]^{n-k} \phi(x)\,dx,$$

where $\phi(x) = (2\pi)^{-1/2} e^{-x^{2}/2}$ and $\Phi(x) = \int_{0}^{x} \phi(x)\,dx$.

[Tabular values are the expected values of the kth largest normal deviate for a sample of
size n from N(0, 1); or when preceded by a minus sign, they are the expected values of the
kth smallest normal deviate.]
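
As a rough check on individual entries, the integral defining the tabulated values can be evaluated numerically. The sketch below uses composite Simpson's rule on a finite interval; the function names, the integration limits and the step count are illustrative assumptions, and this is not the method by which the table itself was computed.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def Phi0(x):
    """Integral of phi from 0 to x, so the usual distribution function is 1/2 + Phi0(x)."""
    return 0.5 * math.erf(x / math.sqrt(2.0))


def expected_order_stat(k, n, lim=10.0, steps=4000):
    """Expected value of the kth largest of n N(0, 1) deviates (Simpson's rule on [-lim, lim])."""
    coef = math.factorial(n) / (math.factorial(n - k) * math.factorial(k - 1))
    h = 2.0 * lim / steps  # steps must be even for Simpson's rule
    total = 0.0
    for i in range(steps + 1):
        x = -lim + i * h
        f = x * (0.5 - Phi0(x)) ** (k - 1) * (0.5 + Phi0(x)) ** (n - k) * phi(x)
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * f
    return coef * total * h / 3.0


# Compare with the entries for (k, n) = (1, 2), (1, 10) and (5, 10) in Table 1
for k, n in [(1, 2), (1, 10), (5, 10)]:
    print(k, n, round(expected_order_stat(k, n), 5))
```

Fixed limits of plus or minus 10 are ample for the sample sizes tabulated here, since the density is negligible beyond them.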








k \ n |       2 |       3 |       4 |       5 |       6 |       7 |       8 |       9
    1 | 0.56419 | 0.84628 | 1.02938 | 1.16296 | 1.26721 | 1.35218 | 1.42360 | 1.48501
    2 |         | 0.00000 | 0.29701 | 0.49502 | 0.64176 | 0.75737 | 0.85222 | 0.93230
    3 |         |         |         | 0.00000 | 0.20155 | 0.35271 | 0.47282 | 0.57197
    4 |         |         |         |         |         | 0.00000 | 0.15251 | 0.27453
    5 |         |         |         |         |         |         |         | 0.00000

k \ n |      10 |      11 |      12 |      13 |      14 |      15 |      16 |      17 |      18 |      19
    1 | 1.53875 | 1.58644 | 1.62923 | 1.66799 | 1.70338 | 1.73591 | 1.76599 | 1.79394 | 1.82003 | 1.84448
    2 | 1.00136 | 1.06192 | 1.11573 | 1.16408 | 1.20790 | 1.24794 | 1.28474 | 1.31878 | 1.35041 | 1.37994
    3 | 0.65606 | 0.72884 | 0.79284 | 0.84983 | 0.90113 | 0.94769 | 0.99027 | 1.02946 | 1.06573 | 1.09945
    4 | 0.37576 | 0.46198 | 0.53684 | 0.60285 | 0.66176 | 0.71488 | 0.76317 | 0.80738 | 0.84812 | 0.88586
    5 | 0.12267 | 0.22489 | 0.31225 | 0.38833 | 0.45557 | 0.51570 | 0.57001 | 0.61946 | 0.66479 | 0.70661
    6 |         | 0.00000 | 0.10259 | 0.19052 | 0.26730 | 0.33530 | 0.39622 | 0.45133 | 0.50158 | 0.54771
    7 |         |         |         | 0.00000 | 0.08816 | 0.16530 | 0.23375 | 0.29519 | 0.35084 | 0.40164
    8 |         |         |         |         |         | 0.00000 | 0.07729 | 0.14599 | 0.20774 | 0.26374
    9 |         |         |         |         |         |         |         | 0.00000 | 0.06880 | 0.13072
   10 |         |         |         |         |         |         |         |         |         | 0.00000

k \ n |      20 |      21 |      22 |      23 |      24 |      25 |      26 |      27 |      28 |      29
    1 | 1.86748 | 1.88917 | 1.90969 | 1.92916 | 1.94767 | 1.96531 | 1.98216 | 1.99827 | 2.01371 | 2.02852
    2 | 1.40760 | 1.43362 | 1.45816 | 1.48137 | 1.50338 | 1.52430 | 1.54423 | 1.56326 | 1.58145 | 1.59888
    3 | 1.13095 | 1.16047 | 1.18824 | 1.21445 | 1.23924 | 1.26275 | 1.28511 | 1.30641 | 1.32674 | 1.34619
    4 | 0.92098 | 0.95380 | 0.98459 | 1.01356 | 1.04091 | 1.06679 | 1.09135 | 1.11471 | 1.13697 | 1.15822
    5 | 0.74538 | 0.78150 | 0.81527 | 0.84697 | 0.87682 | 0.90501 | 0.93171 | 0.95705 | 0.98115 | 1.00414
    6 | 0.59030 | 0.62982 | 0.66667 | 0.70115 | 0.73354 | 0.76405 | 0.79289 | 0.82021 | 0.84615 | 0.87084
    7 | 0.44833 | 0.49148 | 0.53157 | 0.56896 | 0.60399 | 0.63690 | 0.66794 | 0.69727 | 0.72508 | 0.75150
    8 | 0.31493 | 0.36203 | 0.40559 | 0.44609 | 0.48391 | 0.51935 | 0.55267 | 0.58411 | 0.61385 | 0.64205
    9 | 0.18696 | 0.23841 | 0.28579 | 0.32965 | 0.37047 | 0.40860 | 0.44436 | 0.47801 | 0.50977 | 0.53982
   10 | 0.06200 | 0.11836 | 0.16997 | 0.21755 | 0.26163 | 0.30268 | 0.34105 | 0.37706 | 0.41096 | 0.44298
   11 |         | 0.00000 | 0.05642 | 0.10813 | 0.15583 | 0.20006 | 0.24128 | 0.27983 | 0.31603 | 0.35013
   12 |         |         |         | 0.00000 | 0.05176 | 0.09953 | 0.14387 | 0.18520 | 0.22389 | 0.26023
   13 |         |         |         |         |         | 0.00000 | 0.04781 | 0.09220 | 0.13361 | 0.17240
   14 |         |         |         |         |         |         |         | 0.00000 | 0.04442 | 0.08588
   15 |         |         |         |         |         |         |         |         |         | 0.00000

k \ n |      30 |      31 |      32 |      33 |      34 |      35 |      36 |      37 |      38 |      39
    1 | 2.04276 | 2.05646 | 2.06967 | 2.08241 | 2.09471 | 2.10661 | 2.11812 | 2.12928 | 2.14009 | 2.15059
    2 | 1.61560 | 1.63166 | 1.64712 | 1.66200 | 1.67636 | 1.69023 | 1.70362 | 1.71659 | 1.72914 | 1.74131
    3 | 1.36481 | 1.38268 | 1.39985 | 1.41637 | 1.43228 | 1.44762 | 1.46244 | 1.47676 | 1.49061 | 1.50402
    4 | 1.17855 | 1.19803 | 1.21672 | 1.23468 | 1.25196 | 1.26860 | 1.28466 | 1.30016 | 1.31514 | 1.32964
    5 | 1.02609 | 1.04709 | 1.06721 | 1.08652 | 1.10509 | 1.12295 | 1.14016 | 1.15677 | 1.17280 | 1.18830
    6 | 0.89439 | 0.91688 | 0.93841 | 0.95905 | 0.97886 | 0.99790 | 1.01624 | 1.03390 | 1.05095 | 1.06741
    7 | 0.77666 | 0.80066 | 0.82359 | 0.84555 | 0.86660 | 0.88681 | 0.90625 | 0.92496 | 0.94300 | 0.96041
    8 | 0.66885 | 0.69438 | 0.71875 | 0.74204 | 0.76435 | 0.78574 | 0.80629 | 0.82605 | 0.84508 | 0.86343
    9 | 0.56834 | 0.59545 | 0.62129 | 0.64596 | 0.66954 | 0.69214 | 0.71382 | 0.73465 | 0.75468 | 0.77398
   10 | 0.47329 | 0.50206 | 0.52943 | 0.55552 | 0.58043 | 0.60427 | 0.62710 | 0.64902 | 0.67009 | 0.69035
   11 | 0.38235 | 0.41287 | 0.44185 | 0.46942 | 0.49572 | 0.52084 | 0.54488 | 0.56793 | 0.59005 | 0.61131
   12 | 0.29449 | 0.32686 | 0.35755 | 0.38669 | 0.41444 | 0.44091 | 0.46620 | 0.49042 | 0.51363 | 0.53592
   13 | 0.20885 | 0.24322 | 0.27573 | 0.30654 | 0.33582 | 0.36371 | 0.39032 | 0.41576 | 0.44012 | 0.46348
   14 | 0.12473 | 0.16126 | 0.19572 | 0.22832 | 0.25924 | 0.28863 | 0.31663 | 0.34336 | 0.36892 | 0.39340
   15 | 0.04148 | 0.08037 | 0.11695 | 0.15147 | 0.18415 | 0.21515 | 0.24463 | 0.27272 | 0.29954 | 0.32520
   16 |         | 0.00000 | 0.03890 | 0.07552 | 0.11009 | 0.14282 | 0.17388 | 0.20342 | 0.23159 | 0.25849
   17 |         |         |         | 0.00000 | 0.03663 | 0.07123 | 0.10399 | 0.13509 | 0.16469 | 0.19292
   18 |         |         |         |         |         | 0.00000 | 0.03461 | 0.06739 | 0.09853 | 0.12817
   19 |         |         |         |         |         |         |         | 0.00000 | 0.03280 | 0.06395
   20 |         |         |         |         |         |         |         |         |         | 0.00000





Table 1 (cont.)

k \ n |      40 |      41 |      42 |      43 |      44 |      45 |      46 |      47 |      48 |      49
    1 | 2.16078 | 2.17068 | 2.18032 | 2.18969 | 2.19882 | 2.20772 | 2.21639 | 2.22486 | 2.23312 | 2.24119
    2 | 1.75312 | 1.76458 | 1.77571 | 1.78654 | 1.79707 | 1.80733 | 1.81732 | 1.82706 | 1.83655 | 1.84582
    3 | 1.51702 | 1.52964 | 1.54188 | 1.55377 | 1.56533 | 1.57658 | 1.58754 | 1.59820 | 1.60860 | 1.61874
    4 | 1.34368 | 1.35728 | 1.37048 | 1.38329 | 1.39574 | 1.40784 | 1.41962 | 1.43108 | 1.44224 | 1.45312
    5 | 1.20330 | 1.21782 | 1.23190 | 1.24556 | 1.25881 | 1.27170 | 1.28422 | 1.29641 | 1.30827 | 1.31983
    6 | 1.08332 | 1.09872 | 1.11364 | 1.12810 | 1.14213 | 1.15576 | 1.16899 | 1.18186 | 1.19439 | 1.20658
    7 | 0.97722 | 0.99348 | 1.00922 | 1.02446 | 1.03924 | 1.05358 | 1.06751 | 1.08104 | 1.09420 | 1.10701
    8 | 0.88114 | 0.89825 | 0.91480 | 0.93082 | 0.94634 | 0.96139 | 0.97599 | 0.99018 | 1.00396 | 1.01737
    9 | 0.79259 | 0.81056 | 0.82792 | 0.84472 | 0.86097 | 0.87673 | 0.89201 | 0.90684 | 0.92125 | 0.93525
   10 | 0.70988 | 0.72871 | 0.74690 | 0.76448 | 0.78148 | 0.79795 | 0.81391 | 0.82939 | 0.84442 | 0.85902
   11 | 0.63177 | 0.65149 | 0.67052 | 0.68889 | 0.70666 | 0.72385 | 0.74049 | 0.75663 | 0.77228 | 0.78748
   12 | 0.55736 | 0.57799 | 0.59788 | 0.61707 | 0.63561 | 0.65353 | 0.67088 | 0.68768 | 0.70397 | 0.71978
   13 | 0.48591 | 0.50749 | 0.52827 | 0.54830 | 0.56763 | 0.58631 | 0.60438 | 0.62186 | 0.63881 | 0.65523
   14 | 0.41688 | 0.43944 | 0.46114 | 0.48204 | 0.50220 | 0.52166 | 0.54046 | 0.55865 | 0.57625 | 0.59331
   15 | 0.34978 | 0.37337 | 0.39604 | 0.41784 | 0.43885 | 0.45912 | 0.47868 | 0.49759 | 0.51588 | 0.53360
   16 | 0.28423 | 0.30890 | 0.33257 | 0.35533 | 0.37723 | 0.39833 | 0.41868 | 0.43834 | 0.45734 | 0.47573
   17 | 0.21988 | 0.24569 | 0.27043 | 0.29418 | 0.31701 | 0.33898 | 0.36016 | 0.38060 | 0.40034 | 0.41942
   18 | 0.15644 | 0.18345 | 0.20931 | 0.23411 | 0.25792 | 0.28081 | 0.30285 | 0.32410 | 0.34460 | 0.36441
   19 | 0.09362 | 0.12192 | 0.14897 | 0.17488 | 0.19972 | 0.22358 | 0.24652 | 0.26862 | 0.28992 | 0.31049
   20 | 0.03117 | 0.06085 | 0.08917 | 0.11625 | 0.14219 | 0.16707 | 0.19097 | 0.21396 | 0.23610 | 0.25746
   21 |         | 0.00000 | 0.02969 | 0.05803 | 0.08513 | 0.11109 | 0.13600 | 0.15993 | 0.18296 | 0.20514
   22 |         |         |         | 0.00000 | 0.02835 | 0.05546 | 0.08144 | 0.10637 | 0.13033 | 0.15338
   23 |         |         |         |         |         | 0.00000 | 0.02712 | 0.05311 | 0.07805 | 0.10203
   24 |         |         |         |         |         |         |         | 0.00000 | 0.02599 | 0.05095
   25 |         |         |         |         |         |         |         |         |         | 0.00000

k \ n |      50 |      51 |      52 |      53 |      54 |      55 |      56 |      57 |      58 |      59
    1 | 2.24907 | 2.25678 | 2.26432 | 2.27169 | 2.27891 | 2.28598 | 2.29291 | 2.29970 | 2.30635 | 2.31288
    2 | 1.85487 | 1.86371 | 1.87235 | 1.88080 | 1.88906 | 1.89715 | 1.90506 | 1.91282 | 1.92041 | 1.92786
    3 | 1.62863 | 1.63829 | 1.64773 | 1.65695 | 1.66596 | 1.67478 | 1.68340 | 1.69185 | 1.70012 | 1.70822
    4 | 1.46374 | 1.47409 | 1.48420 | 1.49407 | 1.50372 | 1.51315 | 1.52237 | 1.53140 | 1.54024 | 1.54889
    5 | 1.33109 | 1.34207 | 1.35279 | 1.36326 | 1.37348 | 1.38346 | 1.39323 | 1.40278 | 1.41212 | 1.42127
    6 | 1.21846 | 1.23003 | 1.24132 | 1.25234 | 1.26310 | 1.27361 | 1.28387 | 1.29391 | 1.30373 | 1.31334
    7 | 1.11948 | 1.13162 | 1.14347 | 1.15502 | 1.16629 | 1.17729 | 1.18804 | 1.19855 | 1.20882 | 1.21886
    8 | 1.03042 | 1.04312 | 1.05550 | 1.06757 | 1.07934 | 1.09083 | 1.10205 | 1.11300 | 1.12371 | 1.13419
    9 | 0.94887 | 0.96213 | 0.97504 | 0.98762 | 0.99988 | 1.01185 | 1.02352 | 1.03493 | 1.04607 | 1.05695
   10 | 0.87321 | 0.88701 | 0.90045 | 0.91354 | 0.92629 | 0.93873 | 0.95086 | 0.96271 | 0.97427 | 0.98557
   11 | 0.80225 | 0.81661 | 0.83058 | 0.84417 | 0.85742 | 0.87033 | 0.88292 | 0.89520 | 0.90719 | 0.91890
   12 | 0.73513 | 0.75004 | 0.76455 | 0.77866 | 0.79240 | 0.80578 | 0.81883 | 0.83155 | 0.84397 | 0.85609
   13 | 0.67117 | 0.68666 | 0.70170 | 0.71633 | 0.73057 | 0.74444 | 0.75794 | 0.77111 | 0.78396 | 0.79649
   14 | 0.60986 | 0.62592 | 0.64152 | 0.65668 | 0.67143 | 0.68578 | 0.69976 | 0.71337 | 0.72665 | 0.73960
   15 | 0.55077 | 0.56742 | 0.58358 | 0.59928 | 0.61455 | 0.62940 | 0.64385 | 0.65793 | 0.67164 | 0.68502
   16 | 0.49354 | 0.51080 | 0.52755 | 0.54380 | 0.55960 | 0.57495 | 0.58989 | 0.60444 | 0.61860 | 0.63241
   17 | 0.43789 | 0.45578 | 0.47312 | 0.48995 | 0.50629 | 0.52217 | 0.53761 | 0.55263 | 0.56725 | 0.58150
   18 | 0.38357 | 0.40211 | 0.42007 | 0.43749 | 0.45439 | 0.47080 | 0.48675 | 0.50226 | 0.51736 | 0.53205
   19 | 0.33036 | 0.34957 | 0.36818 | 0.38621 | 0.40369 | 0.42065 | 0.43713 | 0.45314 | 0.46872 | 0.48388
   20 | 0.27807 | 0.29799 | 0.31726 | 0.33592 | 0.35400 | 0.37154 | 0.38856 | 0.40510 | 0.42117 | 0.43681
   21 | 0.22653 | 0.24719 | 0.26716 | 0.28648 | 0.30518 | 0.32331 | 0.34090 | 0.35797 | 0.37456 | 0.39068
   22 | 0.17559 | 0.19702 | 0.21772 | 0.23772 | 0.25708 | 0.27583 | 0.29400 | 0.31163 | 0.32875 | 0.34538
   23 | 0.12511 | 0.14735 | 0.16880 | 0.18953 | 0.20957 | 0.22896 | 0.24774 | 0.26595 | 0.28362 | 0.30078
   24 | 0.07494 | 0.09803 | 0.12029 | 0.14177 | 0.16252 | 0.18259 | 0.20201 | 0.22082 | 0.23906 | 0.25677
   25 | 0.02496 | 0.04896 | 0.07206 | 0.09434 | 0.11584 | 0.13661 | 0.15669 | 0.17614 | 0.19498 | 0.21325
   26 |         | 0.00000 | 0.02400 | 0.04712 | 0.06940 | 0.09091 | 0.11170 | 0.13180 | 0.15127 | 0.17013
   27 |         |         |         | 0.00000 | 0.02312 | 0.04541 | 0.06693 | 0.08773 | 0.10785 | 0.12733
   28 |         |         |         |         |         | 0.00000 | 0.02229 | 0.04382 | 0.06463 | 0.08476
   29 |         |         |         |         |         |         |         | 0.00000 | 0.02153 | 0.04234
   30 |         |         |         |         |         |         |         |         |         | 0.00000




















Table 1 (cont.)

k \ n |      60 |      61 |      62 |      63 |      64 |      65 |      66 |      67 |      68 |      69
    1 | 2.31928 | 2.32556 | 2.33173 | 2.33778 | 2.34373 | 2.34958 | 2.35532 | 2.36097 | 2.36652 | 2.37199
    2 | 1.93516 | 1.94232 | 1.94934 | 1.95624 | 1.96301 | 1.96965 | 1.97618 | 1.98260 | 1.98891 | 1.99510
    3 | 1.71616 | 1.72394 | 1.73158 | 1.73906 | 1.74641 | 1.75363 | 1.76071 | 1.76767 | 1.77451 | 1.78122
    4 | 1.55736 | 1.56567 | 1.57381 | 1.58180 | 1.58963 | 1.59732 | 1.60487 | 1.61228 | 1.61955 | 1.62670
    5 | 1.43023 | 1.43900 | 1.44760 | 1.45603 | 1.46430 | 1.47241 | 1.48036 | 1.48817 | 1.49584 | 1.50338
    6 | 1.32274 | 1.33195 | 1.34097 | 1.34982 | 1.35848 | 1.36698 | 1.37532 | 1.38351 | 1.39154 | 1.39942
    7 | 1.22869 | 1.23832 | 1.24774 | 1.25698 | 1.26603 | 1.27490 | 1.28360 | 1.29213 | 1.30051 | 1.30873
    8 | 1.14443 | 1.15445 | 1.16427 | 1.17388 | 1.18329 | 1.19252 | 1.20157 | 1.21044 | 1.21915 | 1.22769
    9 | 1.06760 | 1.07802 | 1.08821 | 1.09819 | 1.10797 | 1.11754 | 1.12693 | 1.13613 | 1.14516 | 1.15401
   10 | 0.99662 | 1.00742 | 1.01799 | 1.02833 | 1.03846 | 1.04838 | 1.05810 | 1.06762 | 1.07696 | 1.08612
   11 | 0.93034 | 0.94153 | 0.95247 | 0.96317 | 0.97365 | 0.98391 | 0.99395 | 1.00380 | 1.01345 | 1.02291
   12 | 0.86793 | 0.87950 | 0.89081 | 0.90187 | 0.91270 | 0.92329 | 0.93367 | 0.94383 | 0.95379 | 0.96355
   13 | 0.80873 | 0.82068 | 0.83237 | 0.84379 | 0.85496 | 0.86590 | 0.87660 | 0.88708 | 0.89735 | 0.90741
   14 | 0.75224 | 0.76459 | 0.77665 | 0.78843 | 0.79996 | 0.81123 | 0.82226 | 0.83306 | 0.84364 | 0.85400
   15 | 0.69807 | 0.71081 | 0.72324 | 0.73540 | 0.74727 | 0.75889 | 0.77025 | 0.78138 | 0.79226 | 0.80293
   16 | 0.64587 | 0.65901 | 0.67183 | 0.68436 | 0.69659 | 0.70856 | 0.72025 | 0.73170 | 0.74290 | 0.75387
   17 | 0.59538 | 0.60893 | 0.62214 | 0.63504 | 0.64764 | 0.65996 | 0.67200 | 0.68377 | 0.69529 | 0.70657
   18 | 0.54637 | 0.56033 | 0.57395 | 0.58723 | 0.60020 | 0.61288 | 0.62526 | 0.63737 | 0.64921 | 0.66080
   19 | 0.49864 | 0.51303 | 0.52705 | 0.54073 | 0.55408 | 0.56712 | 0.57985 | 0.59230 | 0.60447 | 0.61638
   20 | 0.45202 | 0.46685 | 0.48129 | 0.49537 | 0.50911 | 0.52252 | 0.53561 | 0.54841 | 0.56091 | 0.57314
   21 | 0.40637 | 0.42164 | 0.43652 | 0.45101 | 0.46515 | 0.47894 | 0.49240 | 0.50555 | 0.51839 | 0.53095
   22 | 0.36155 | 0.37729 | 0.39260 | 0.40752 | 0.42207 | 0.43625 | 0.45009 | 0.46360 | 0.47680 | 0.48969
   23 | 0.31745 | 0.33366 | 0.34944 | 0.36480 | 0.37976 | 0.39435 | 0.40857 | 0.42245 | 0.43601 | 0.44925
   24 | 0.27396 | 0.29066 | 0.30691 | 0.32272 | 0.33812 | 0.35312 | 0.36775 | 0.38201 | 0.39594 | 0.40953
   25 | 0.23098 | 0.24820 | 0.26494 | 0.28122 | 0.29706 | 0.31249 | 0.32753 | 0.34219 | 0.35649 | 0.37045
   26 | 0.18842 | 0.20618 | 0.22343 | 0.24019 | 0.25650 | 0.27237 | 0.28784 | 0.30290 | 0.31759 | 0.33192
   27 | 0.14621 | 0.16452 | 0.18230 | 0.19957 | 0.21636 | 0.23269 | 0.24859 | 0.26408 | 0.27917 | 0.29389
   28 | 0.10425 | 0.12315 | 0.14148 | 0.15927 | 0.17656 | 0.19337 | 0.20973 | 0.22565 | 0.24116 | 0.25627
   29 | 0.06248 | 0.08198 | 0.10089 | 0.11923 | 0.13704 | 0.15435 | 0.17118 | 0.18755 | 0.20349 | 0.21902
   30 | 0.02081 | 0.04096 | 0.06047 | 0.07938 | 0.09774 | 0.11556 | 0.13288 | 0.14972 | 0.16611 | 0.18207
   31 |         | 0.00000 | 0.02014 | 0.03966 | 0.05858 | 0.07694 | 0.09478 | 0.11211 | 0.12896 | 0.14536
   32 |         |         |         | 0.00000 | 0.01952 | 0.03844 | 0.05681 | 0.07465 | 0.09199 | 0.10885
   33 |         |         |         |         |         | 0.00000 | 0.01893 | 0.03730 | 0.05514 | 0.07249
   34 |         |         |         |         |         |         |         | 0.00000 | 0.01837 | 0.03622
   35 |         |         |         |         |         |         |         |         |         | 0.00000

k \ n |      70 |      71 |      72 |      73 |      74 |      75 |      76 |      77 |      78 |      79
    1 | 2.37736 | 2.38265 | 2.38785 | 2.39298 | 2.39802 | 2.40299 | 2.40789 | 2.41271 | 2.41747 | 2.42215
    2 | 2.00120 | 2.00720 | 2.01310 | 2.01890 | 2.02462 | 2.03024 | 2.03578 | 2.04124 | 2.04662 | 2.05191
    3 | 1.78783 | 1.79432 | 1.80071 | 1.80699 | 1.81317 | 1.81926 | 1.82525 | 1.83115 | 1.83696 | 1.84268
    4 | 1.63373 | 1.64063 | 1.64742 | 1.65410 | 1.66067 | 1.66714 | 1.67350 | 1.67976 | 1.68592 | 1.69200
    5 | 1.51078 | 1.51805 | 1.52520 | 1.53223 | 1.53914 | 1.54594 | 1.55263 | 1.55921 | 1.56569 | 1.57207
    6 | 1.40717 | 1.41478 | 1.42226 | 1.42961 | 1.43684 | 1.44395 | 1.45094 | 1.45782 | 1.46459 | 1.47125
    7 | 1.31680 | 1.32473 | 1.33252 | 1.34017 | 1.34770 | 1.35510 | 1.36237 | 1.36953 | 1.37657 | 1.38350
    8 | 1.23608 | 1.24431 | 1.25240 | 1.26034 | 1.26815 | 1.27583 | 1.28338 | 1.29080 | 1.29810 | 1.30529
    9 | 1.16270 | 1.17123 | 1.17961 | 1.18784 | 1.19592 | 1.20387 | 1.21168 | 1.21936 | 1.22691 | 1.23434
   10 | 1.09511 | 1.10393 | 1.11259 | 1.12110 | 1.12945 | 1.13766 | 1.14572 | 1.15365 | 1.16145 | 1.16912
   11 | 1.03220 | 1.04130 | 1.05024 | 1.05902 | 1.06764 | 1.07610 | 1.08442 | 1.09260 | 1.10063 | 1.10854
   12 | 0.97313 | 0.98252 | 0.99173 | 1.00078 | 1.00966 | 1.01838 | 1.02695 | 1.03537 | 1.04364 | 1.05178
   13 | 0.91728 | 0.92695 | 0.93644 | 0.94576 | 0.95490 | 0.96387 | 0.97269 | 0.98135 | 0.98986 | 0.99822
   14 | 0.86416 | 0.87412 | 0.88388 | 0.89346 | 0.90286 | 0.91209 | 0.92115 | 0.93005 | 0.93880 | 0.94739
   15 | 0.81338 | 0.82362 | 0.83366 | 0.84351 | 0.85317 | 0.86265 | 0.87196 | 0.88110 | 0.89008 | 0.89890
   16 | 0.76462 | 0.77514 | 0.78546 | 0.79558 | 0.80550 | 0.81524 | 0.82480 | 0.83418 | 0.84339 | 0.85244
   17 | 0.71761 | 0.72843 | 0.73903 | 0.74942 | 0.75960 | 0.76960 | 0.77940 | 0.78903 | 0.79848 | 0.80776
   18 | 0.67214 | 0.68325 | 0.69413 | 0.70480 | 0.71526 | 0.72551 | 0.73557 | 0.74544 | 0.75512 | 0.76463
   19 | 0.62803 | 0.63943 | 0.65060 | 0.66155 | 0.67227 | 0.68279 | 0.69310 | 0.70322 | 0.71314 | 0.72289
   20 | 0.58510 | 0.59681 | 0.60827 | 0.61950 | 0.63050 | 0.64128 | 0.65185 | 0.66222 | 0.67239 | 0.68237
   21 | 0.54323 | 0.55525 | 0.56701 | 0.57852 | 0.58980 | 0.60085 | 0.61168 | 0.62230 | 0.63272 | 0.64294
   22 | 0.50230 | 0.51463 | 0.52669 | 0.53850 | 0.55006 | 0.56138 | 0.57248 | 0.58336 | 0.59403 | 0.60449
   23 | 0.46219 | 0.47484 | 0.48721 | 0.49932 | 0.51117 | 0.52277 | 0.53414 | 0.54528 | 0.55621 | 0.56692
   24 | 0.42281 | 0.43579 | 0.44848 | 0.46089 | 0.47304 | 0.48493 | 0.49657 | 0.50798 | 0.51917 | 0.53013
   25 | 0.38404 | 0.39739 | 0.41041 | 0.42313 | 0.43558 | 0.44777 | 0.45970 | 0.47138 | 0.48283 | 0.49404
   26 | 0.34591 | 0.35958 | 0.37292 | 0.38597 | 0.39873 | 0.41122 | 0.42343 | 0.43540 | 0.44711 | 0.45859
   27 | 0.30825 | 0.32227 | 0.33596 | 0.34934 | 0.36242 | 0.37521 | 0.38772 | 0.39997 | 0.41196 | 0.42371
   28 | 0.27102 | 0.28540 | 0.29945 | 0.31317 | 0.32657 | 0.33968 | 0.35250 | 0.36504 | 0.37731 | 0.38934
   29 | 0.23416 | 0.24893 | 0.26333 | 0.27740 | 0.29114 | 0.30457 | 0.31770 | 0.33055 | 0.34311 | 0.35542
   30 | 0.19762 | 0.21277 | 0.22756 | 0.24199 | 0.25608 | 0.26984 | 0.28329 | 0.29645 | 0.30931 | 0.32190
   31 | 0.16134 | 0.17690 | 0.19208 | 0.20688 | 0.22133 | 0.23543 | 0.24922 | 0.26269 | 0.27586 | 0.28875
   32 | 0.12527 | 0.14125 | 0.15683 | 0.17202 | 0.18684 | 0.20130 | 0.21543 | 0.22923 | 0.24272 | 0.25591
   33 | 0.08936 | 0.10579 | 0.12178 | 0.13737 | 0.15257 | 0.16740 | 0.18188 | 0.19602 | 0.20983 | 0.22334
   34 | 0.05357 | 0.07045 | 0.08688 | 0.10289 | 0.11848 | 0.13370 | 0.14854 | 0.16303 | 0.17718 | 0.19101
   35 | 0.01785 | 0.03520 | 0.05209 | 0.06852 | 0.08453 | 0.10014 | 0.11536 | 0.13021 | 0.14471 | 0.15888
   36 |         | 0.00000 | 0.01736 | 0.03424 | 0.05068 | 0.06670 | 0.08231 | 0.09754 | 0.11240 | 0.12691
   37 |         |         |         | 0.00000 | 0.01689 | 0.03333 | 0.04935 | 0.06497 | 0.08020 | 0.09507
   38 |         |         |         |         |         | 0.00000 | 0.01644 | 0.03247 | 0.04809 | 0.06333
   39 |         |         |         |         |         |         |         | 0.00000 | 0.01602 | 0.03165
   40 |         |         |         |         |         |         |         |         |         | 0.00000

k \ n |      80 |      81 |      82 |      83 |      84 |      85 |      86 |      87 |      88 |      89
    1 | 2.42677 | 2.43133 | 2.43582 | 2.44026 | 2.44463 | 2.44894 | 2.45320 | 2.45741 | 2.46156 | 2.46565
    2 | 2.05714 | 2.06228 | 2.06735 | 2.07236 | 2.07729 | 2.08216 | 2.08696 | 2.09170 | 2.09637 | 2.10099
    3 | 1.84832 | 1.85387 | 1.85935 | 1.86475 | 1.87007 | 1.87532 | 1.88049 | 1.88560 | 1.89064 | 1.89561
    4 | 1.69798 | 1.70387 | 1.70968 | 1.71540 | 1.72104 | 1.72660 | 1.73209 | 1.73750 | 1.74283 | 1.74810
    5 | 1.57836 | 1.58455 | 1.59065 | 1.59665 | 1.60258 | 1.60841 | 1.61417 | 1.61984 | 1.62544 | 1.63096
    6 | 1.47781 | 1.48428 | 1.49064 | 1.49691 | 1.50309 | 1.50918 | 1.51518 | 1.52110 | 1.52693 | 1.53269
    7 | 1.39032 | 1.39704 | 1.40366 | 1.41017 | 1.41659 | 1.42292 | 1.42915 | 1.43529 | 1.44135 | 1.44732
    8 | 1.31236 | 1.31932 | 1.32617 | 1.33292 | 1.33957 | 1.34611 | 1.35257 | 1.35893 | 1.36520 | 1.37138
    9 | 1.24165 | 1.24884 | 1.25593 | 1.26290 | 1.26977 | 1.27653 | 1.28320 | 1.28976 | 1.29624 | 1.30262
   10 | 1.17666 | 1.18409 | 1.19139 | 1.19859 | 1.20567 | 1.21264 | 1.21951 | 1.22628 | 1.23295 | 1.23952
   11 | 1.11631 | 1.12396 | 1.13148 | 1.13889 | 1.14618 | 1.15336 | 1.16043 | 1.16740 | 1.17426 | 1.18102
   12 | 1.05978 | 1.06764 | 1.07539 | 1.08300 | 1.09050 | 1.09788 | 1.10515 | 1.11231 | 1.11936 | 1.12631
   13 | 1.00644 | 1.01453 | 1.02249 | 1.03031 | 1.03802 | 1.04560 | 1.05306 | 1.06041 | 1.06765 | 1.07478
   14 | 0.95584 | 0.96414 | 0.97231 | 0.98034 | 0.98825 | 0.99603 | 1.00369 | 1.01122 | 1.01865 | 1.02596
   15 | 0.90757 | 0.91609 | 0.92447 | 0.93271 | 0.94082 | 0.94880 | 0.95665 | 0.96437 | 0.97198 | 0.97948
   16 | 0.86134 | 0.87007 | 0.87867 | 0.88711 | 0.89542 | 0.90360 | 0.91164 | 0.91956 | 0.92735 | 0.93502
   17 | 0.81687 | 0.82583 | 0.83464 | 0.84329 | 0.85180 | 0.86017 | 0.86841 | 0.87651 | 0.88449 | 0.89234
   18 | 0.77398 | 0.78315 | 0.79217 | 0.80103 | 0.80975 | 0.81832 | 0.82675 | 0.83504 | 0.84320 | 0.85123
   19 | 0.73246 | 0.74186 | 0.75109 | 0.76016 | 0.76908 | 0.77785 | 0.78647 | 0.79496 | 0.80330 | 0.81152
   20 | 0.69217 | 0.70179 | 0.71124 | 0.72053 | 0.72965 | 0.73862 | 0.74744 | 0.75611 | 0.76465 | 0.77304
   21 | 0.65297 | 0.66282 | 0.67249 | 0.68199 | 0.69133 | 0.70050 | 0.70952 | 0.71838 | 0.72710 | 0.73568
   22 | 0.61476 | 0.62484 | 0.63473 | 0.64445 | 0.65399 | 0.66337 | 0.67259 | 0.68165 | 0.69056 | 0.69932
   23 | 0.57742 | 0.58773 | 0.59785 | 0.60779 | 0.61755 | 0.62714 | 0.63656 | 0.64581 | 0.65492 | 0.66387
   24 | 0.54088 | 0.55143 | 0.56178 | 0.57193 | 0.58191 | 0.59171 | 0.60133 | 0.61079 | 0.62009 | 0.62923
   25 | 0.50504 | 0.51583 | 0.52641 | 0.53680 | 0.54700 | 0.55701 | 0.56684 | 0.57650 | 0.58600 | 0.59533
   26 | 0.46985 | 0.48088 | 0.49170 | 0.50232 | 0.51274 | 0.52297 | 0.53301 | 0.54288 | 0.55258 | 0.56210
   27 | 0.43522 | 0.44651 | 0.45757 | 0.46842 | 0.47907 | 0.48952 | 0.49979 | 0.50986 | 0.51976 | 0.52949
   28 | 0.40111 | 0.41265 | 0.42397 | 0.43506 | 0.44594 | 0.45662 | 0.46710 | 0.47739 | 0.48750 | 0.49743
   29 | 0.36747 | 0.37927 | 0.39084 | 0.40218 | 0.41330 | 0.42421 | 0.43491 | 0.44542 | 0.45574 | 0.46587
   30 | 0.33423 | 0.34630 | 0.35813 | 0.36972 | 0.38108 | 0.39223 | 0.40316 | 0.41389 | 0.42443 | 0.43477
   31 | 0.30136 | 0.31371 | 0.32580 | 0.33765 | 0.34926 | 0.36065 | 0.37182 | 0.38278 | 0.39353 | 0.40409
   32 | 0.26881 | 0.28144 | 0.29381 | 0.30592 | 0.31779 | 0.32943 | 0.34084 | 0.35203 | 0.36300 | 0.37378
   33 | 0.23655 | 0.24947 | 0.26212 | 0.27450 | 0.28664 | 0.29852 | 0.31018 | 0.32161 | 0.33281 | 0.34381
   34 | 0.20453 | 0.21775 | 0.23069 | 0.24335 | 0.25576 | 0.26790 | 0.27981 | 0.29148 | 0.30292 | 0.31415
   35 | 0.17272 | 0.18625 | 0.19949 | 0.21244 | 0.22512 | 0.23753 | 0.24970 | 0.26162 | 0.27330 | 0.28476
   36 | 0.14108 | 0.15493 | 0.16848 | 0.18172 | 0.19469 | 0.20738 | 0.21981 | 0.23199 | 0.24392 | 0.25562
   37 | 0.10959 | 0.12377 | 0.13763 | 0.15118 | 0.16444 | 0.17741 | 0.19012 | 0.20256 | 0.21475 | 0.22669
   38 | 0.07820 | 0.09272 | 0.10691 | 0.12078 | 0.13434 | 0.14761 | 0.16059 | 0.17330 | 0.18576 | 0.19796
   39 | 0.04689 | 0.06177 | 0.07629 | 0.09049 | 0.10436 | 0.11793 | 0.13121 | 0.14420 | 0.15692 | 0.16938
   40 | 0.01562 | 0.03087 | 0.04575 | 0.06028 | 0.07448 | 0.08836 | 0.10193 | 0.11521 | 0.12821 | 0.14094
   41 |         | 0.00000 | 0.01524 | 0.03013 | 0.04466 | 0.05886 | 0.07275 | 0.08633 | 0.09961 | 0.11262
   42 |         |         |         | 0.00000 | 0.01488 | 0.02942 | 0.04362 | 0.05751 | 0.07110 | 0.08439
   43 |         |         |         |         |         | 0.00000 | 0.01454 | 0.02874 | 0.04263 | 0.05622
   44 |         |         |         |         |         |         |         | 0.00000 | 0.01421 | 0.02810
   45 |         |         |         |         |         |         |         |         |         | 0.00000























Table 1 (cont.)

k \ n |      90 |      91 |      92 |      93 |      94 |      95 |      96 |      97 |      98 |      99
    1 | 2.46970 | 2.47370 | 2.47764 | 2.48154 | 2.48540 | 2.48920 | 2.49297 | 2.49669 | 2.50036 | 2.50400
    2 | 2.10554 | 2.11004 | 2.11448 | 2.11887 | 2.12321 | 2.12749 | 2.13172 | 2.13590 | 2.14003 | 2.14411
    3 | 1.90052 | 1.90536 | 1.91015 | 1.91487 | 1.91953 | 1.92414 | 1.92869 | 1.93318 | 1.93763 | 1.94201
    4 | 1.75329 | 1.75842 | 1.76348 | 1.76848 | 1.77341 | 1.77828 | 1.78309 | 1.78784 | 1.79254 | 1.79718
    5 | 1.63641 | 1.64178 | 1.64709 | 1.65232 | 1.65749 | 1.66259 | 1.66763 | 1.67261 | 1.67752 | 1.68238
    6 | 1.53836 | 1.54396 | 1.54949 | 1.55494 | 1.56033 | 1.56564 | 1.57089 | 1.57607 | 1.58118 | 1.58624
    7 | 1.45321 | 1.45903 | 1.46476 | 1.47042 | 1.47600 | 1.48151 | 1.48695 | 1.49232 | 1.49762 | 1.50286
    8 | 1.37747 | 1.38348 | 1.38941 | 1.39526 | 1.40103 | 1.40673 | 1.41235 | 1.41790 | 1.42338 | 1.42879
    9 | 1.30891 | 1.31511 | 1.32123 | 1.32726 | 1.33321 | 1.33909 | 1.34489 | 1.35061 | 1.35626 | 1.36183
   10 | 1.24600 | 1.25239 | 1.25869 | 1.26491 | 1.27104 | 1.27708 | 1.28305 | 1.28894 | 1.29475 | 1.30049
   11 | 1.18769 | 1.19426 | 1.20073 | 1.20712 | 1.21342 | 1.21964 | 1.22577 | 1.23182 | 1.23779 | 1.24368
   12 | 1.13316 | 1.13990 | 1.14656 | 1.15311 | 1.15958 | 1.16596 | 1.17226 | 1.17847 | 1.18459 | 1.19064
   13 | 1.08181 | 1.08873 | 1.09555 | 1.10228 | 1.10891 | 1.11546 | 1.12191 | 1.12827 | 1.13455 | 1.14075
   14 | 1.03316 | 1.04026 | 1.04726 | 1.05415 | 1.06095 | 1.06765 | 1.07426 | 1.08078 | 1.08721 | 1.09356
   15 | 0.98686 | 0.99413 | 1.00129 | 1.00835 | 1.01531 | 1.02217 | 1.02894 | 1.03561 | 1.04219 | 1.04868
   16 | 0.94258 | 0.95002 | 0.95735 | 0.96458 | 0.97170 | 0.97872 | 0.98564 | 0.99246 | 0.99919 | 1.00583
   17 | 0.90007 | 0.90769 | 0.91519 | 0.92258 | 0.92986 | 0.93704 | 0.94411 | 0.95109 | 0.95797 | 0.96475
   18 | 0.85914 | 0.86693 | 0.87460 | 0.88215 | 0.88959 | 0.89693 | 0.90416 | 0.91129 | 0.91831 | 0.92524
   19 | 0.81960 | 0.82756 | 0.83540 | 0.84312 | 0.85072 | 0.85822 | 0.86560 | 0.87288 | 0.88006 | 0.88713
   20 | 0.78131 | 0.78944 | 0.79745 | 0.80533 | 0.81310 | 0.82075 | 0.82829 | 0.83572 | 0.84305 | 0.85027
   21 | 0.74412 | 0.75243 | 0.76061 | 0.76866 | 0.77659 | 0.78441 | 0.79210 | 0.79968 | 0.80716 | 0.81452
   22 | 0.70795 | 0.71643 | 0.72478 | 0.73300 | 0.74110 | 0.74907 | 0.75692 | 0.76466 | 0.77228 | 0.77980
   23 | 0.67267 | 0.68134 | 0.68986 | 0.69825 | 0.70651 | 0.71464 | 0.72266 | 0.73055 | 0.73832 | 0.74598
   24 | 0.63822 | 0.64706 | 0.65576 | 0.66432 | 0.67275 | 0.68105 | 0.68922 | 0.69727 | 0.70519 | 0.71301
   25 | 0.60451 | 0.61353 | 0.62241 | 0.63115 | 0.63974 | 0.64821 | 0.65654 | 0.66474 | 0.67282 | 0.68079
   26 | 0.57147 | 0.58068 | 0.58974 | 0.59865 | 0.60742 | 0.61605 | 0.62454 | 0.63291 | 0.64115 | 0.64926
   27 | 0.53905 | 0.54845 | 0.55769 | 0.56678 | 0.57572 | 0.58452 | 0.59318 | 0.60170 | 0.61010 | 0.61837
   28 | 0.50718 | 0.51677 | 0.52620 | 0.53547 | 0.54459 | 0.55356 | 0.56239 | 0.57108 | 0.57963 | 0.58805
   29 | 0.47582 | 0.48561 | 0.49522 | 0.50468 | 0.51398 | 0.52312 | 0.53212 | 0.54097 | 0.54969 | 0.55827
   30 | 0.44493 | 0.45491 | 0.46472 | 0.47436 | 0.48384 | 0.49316 | 0.50233 | 0.51136 | 0.52024 | 0.52898
   31 | 0.41445 | 0.42463 | 0.43464 | 0.44447 | 0.45414 | 0.46364 | 0.47299 | 0.48218 | 0.49123 | 0.50013
   32 | 0.38436 | 0.39474 | 0.40495 | 0.41498 | 0.42483 | 0.43452 | 0.44404 | 0.45341 | 0.46263 | 0.47170
   33 | 0.35461 | 0.36520 | 0.37561 | 0.38584 | 0.39588 | 0.40576 | 0.41547 | 0.42501 | 0.43440 | 0.44364
   34 | 0.32517 | 0.33598 | 0.34660 | 0.35702 | 0.36727 | 0.37733 | 0.38722 | 0.39695 | 0.40652 | 0.41593
   35 | 0.29601 | 0.30704 | 0.31787 | 0.32850 | 0.33895 | 0.34921 | 0.35929 | 0.36920 | 0.37895 | 0.38853
   36 | 0.26710 | 0.27835 | 0.28940 | 0.30025 | 0.31090 | 0.32136 | 0.33163 | 0.34173 | 0.35166 | 0.36142
   37 | 0.23841 | 0.24990 | 0.26117 | 0.27223 | 0.28309 | 0.29375 | 0.30423 | 0.31452 | 0.32464 | 0.33458
   38 | 0.20991 | 0.22164 | 0.23314 | 0.24443 | 0.25550 | 0.26637 | 0.27705 | 0.28754 | 0.29785 | 0.30797
   39 | 0.18159 | 0.19356 | 0.20530 | 0.21681 | 0.22810 | 0.23919 | 0.25008 | 0.26077 | 0.27127 | 0.28159
   40 | 0.15341 | 0.16563 | 0.17761 | 0.18936 | 0.20088 | 0.21219 | 0.22328 | 0.23418 | 0.24488 | 0.25539
   41 | 0.12536 | 0.13783 | 0.15006 | 0.16205 | 0.17380 | 0.18533 | 0.19665 | 0.20776 | 0.21866 | 0.22937
   42 | 0.09740 | 0.11014 | 0.12262 | 0.13486 | 0.14685 | 0.15861 | 0.17015 | 0.18148 | 0.19259 | 0.20351
   43 | 0.06952 | 0.08253 | 0.09528 | 0.10777 | 0.12001 | 0.13201 | 0.14378 | 0.15533 | 0.16666 | 0.17778
   44 | 0.04169 | 0.05499 | 0.06801 | 0.08076 | 0.09325 | 0.10550 | 0.11750 | 0.12928 | 0.14083 | 0.15217
   45 | 0.01389 | 0.02748 | 0.04078 | 0.05381 | 0.06656 | 0.07906 | 0.09131 | 0.10332 | 0.11510 | 0.12666
   46 |         | 0.00000 | 0.01359 | 0.02689 | 0.03992 | 0.05267 | 0.06518 | 0.07743 | 0.08944 | 0.10123
   47 |         |         |         | 0.00000 | 0.01330 | 0.02633 | 0.03909 | 0.05159 | 0.06385 | 0.07586
   48 |         |         |         |         |         | 0.00000 | 0.01303 | 0.02579 | 0.03829 | 0.05055
   49 |         |         |         |         |         |         |         | 0.00000 | 0.01276 | 0.02527
   50 |         |         |         |         |         |         |         |         |         | 0.00000















































Table 1 (cont.)

k \ n |     100 |     125 |     150 |     175 |     200 |     225 |     250 |     300 |     350 |     400
    1 | 2.50759 | 2.58634 | 2.64925 | 2.70148 | 2.74604 | 2.78485 | 2.81918 | 2.87777 | 2.92651 | 2.96818
    2 | 2.14814 | 2.23630 | 2.30638 | 2.36434 | 2.41365 | 2.45649 | 2.49431 | 2.55867 | 2.61207 | 2.65761
    3 | 1.94635 | 2.04090 | 2.11578 | 2.17755 | 2.22999 | 2.27547 | 2.31555 | 2.38365 | 2.44004 | 2.48806
    4 | 1.80176 | 1.90146 | 1.98019 | 2.04500 | 2.09991 | 2.14746 | 2.18932 | 2.26033 | 2.31904 | 2.36897
    5 | 1.68718 | 1.79137 | 1.87341 | 1.94081 | 1.99783 | 2.04713 | 2.09050 | 2.16397 | 2.22462 | 2.27615
    6 | 1.59123 | 1.69947 | 1.78448 | 1.85419 | 1.91308 | 1.96395 | 2.00864 | 2.08427 | 2.14663 | 2.19955
    7 | 1.50803 | 1.62002 | 1.70777 | 1.77959 | 1.84019 | 1.89247 | 1.93837 | 2.01595 | 2.07985 | 2.13402
    8 | 1.43414 | 1.54966 | 1.63997 | 1.71376 | 1.77594 | 1.82953 | 1.87654 | 1.95592 | 2.02122 | 2.07654
    9 | 1.36734 | 1.48623 | 1.57896 | 1.65462 | 1.71828 | 1.77310 | 1.82115 | 1.90220 | 1.96882 | 2.02521
   10 | 1.30615 | 1.42828 | 1.52333 | 1.60075 | 1.66583 | 1.72182 | 1.77084 | 1.85348 | 1.92133 | 1.97871
   11 | 1.24950 | 1.37477 | 1.47206 | 1.55118 | 1.61760 | 1.67470 | 1.72466 | 1.80879 | 1.87781 | 1.93614
   12 | 1.19661 | 1.32493 | 1.42438 | 1.50514 | 1.57287 | 1.63103 | 1.68189 | 1.76746 | 1.83758 | 1.89681
   13 | 1.14687 | 1.27819 | 1.37975 | 1.46210 | 1.53109 | 1.59027 | 1.64199 | 1.72894 | 1.80013 | 1.8602?
   14 | 1.09982 | 1.23409 | 1.33771 | 1.42161 | 1.49182 | 1.55200 | 1.60455 | 1.69283 | 1.76504 | 1.82595
   15 | 1.05509 | 1.19226 | 1.29791 | 1.38333 | 1.45472 | 1.51588 | 1.56923 | 1.65880 | 1.73201 | 1.79371
   16 | 1.01238 | 1.15243 | 1.26007 | 1.34697 | 1.41953 | 1.48163 | 1.53577 | 1.62659 | 1.70076 | 1.76324
   17 | 0.97145 | 1.11435 | 1.22396 | 1.31232 | 1.38602 | 1.44904 | 1.50395 | 1.59599 | 1.67109 | 1.73432
   18 | 0.93208 | 1.07783 | 1.18937 | 1.27917 | 1.35399 | 1.41792 | 1.47359 | 1.56681 | 1.64283 | 1.70678
   19 | 0.89411 | 1.04268 | 1.15616 | 1.24738 | 1.32330 | 1.38812 | 1.44452 | 1.53891 | 1.61582 | 1.68048
   20 | 0.85739 | 1.00879 | 1.12417 | 1.21680 | 1.29381 | 1.35950 | 1.41663 | 1.51216 | 1.58994 | 1.65530
   21 | 0.82179 | 0.97601 | 1.09330 | 1.18731 | 1.26540 | 1.33195 | 1.38980 | 1.48645 | 1.56508 | 1.63112
   22 | 0.78720 | 0.94426 | 1.06344 | 1.15883 | 1.23798 | 1.30539 | 1.36393 | 1.46169 | 1.54116 | 1.60786
   23 | 0.75353 | 0.91342 | 1.03449 | 1.13126 | 1.21146 | 1.27971 | 1.33895 | 1.43780 | 1.51809 | 1.58544
   24 | 0.72070 | 0.88344 | 1.00639 | 1.10452 | 1.18577 | 1.25485 | 1.31478 | 1.41470 | 1.49580 | 1.56379
   25 | 0.68863 | 0.85423 | 0.97907 | 1.07855 | 1.16084 | 1.23074 | 1.29135 | 1.39233 | 1.47423 | 1.54285
   26 | 0.65725 | 0.82573 | 0.95245 | 1.05329 | 1.13661 | 1.20733 | 1.26861 | 1.37063 | 1.45332 | 1.52257
   27 | 0.62651 | 0.79789 | 0.92650 | 1.02868 | 1.11303 | 1.18457 | 1.24651 | 1.34957 | 1.43303 | 1.50289
   28 | 0.59635 | 0.77065 | 0.90115 | 1.00469 | 1.09005 | 1.16240 | 1.22500 | 1.32908 | 1.41332 | 1.48378
   29 | 0.56672 | 0.74398 | 0.87638 | 0.98125 | 1.06763 | 1.14079 | 1.20405 | 1.30914 | 1.39414 | 1.46520
   30 | 0.53758 | 0.71782 | 0.85212 | 0.95835 | 1.04574 | 1.11970 | 1.18361 | 1.28971 | 1.37546 | 1.44711
   31 | 0.50890 | 0.69215 | 0.82836 | 0.93594 | 1.02434 | 1.09909 | 1.16365 | 1.27076 | 1.35725 | 1.42948
   32 | 0.48062 | 0.66692 | 0.80506 | 0.91399 | 1.00340 | 1.07895 | 1.14415 | 1.25225 | 1.33947 | 1.41228
   33 | 0.45273 | 0.64212 | 0.78219 | 0.89247 | 0.98290 | 1.05923 | 1.12507 | 1.23415 | 1.32211 | 1.39550
   34 | 0.42518 | 0.61770 | 0.75973 | 0.87135 | 0.96279 | 1.03992 | 1.10640 | 1.21646 | 1.30515 | 1.37910
   35 | 0.39796 | 0.59365 | 0.73764 | 0.85062 | 0.94307 | 1.02098 | 1.08810 | 1.19914 | 1.28854 | 1.36306
   36 | 0.37102 | 0.56993 | 0.71590 | 0.83025 | 0.92371 | 1.00241 | 1.07016 | 1.18217 | 1.27229 | 1.34736
   37 | 0.34436 | 0.54653 | 0.69450 | 0.81022 | 0.90469 | 0.98418 | 1.05256 | 1.16553 | 1.25637 | 1.33199
   38 | 0.31793 | 0.52343 | 0.67341 | 0.79051 | 0.88599 | 0.96626 | 1.03528 | 1.14921 | 1.24076 | 1.31693
   39 | 0.29173 | 0.50061 | 0.65261 | 0.77110 | 0.86760 | 0.94866 | 1.01830 | 1.13320 | 1.22544 | 1.30216
   40 | 0.26572 | 0.47804 | 0.63210 | 0.75197 | 0.84950 | 0.93134 | 1.00161 | 1.11746 | 1.21041 | 1.28767
   41 | 0.23990 | 0.45571 | 0.61185 | 0.73312 | 0.83167 | 0.91429 | 0.98520 | 1.10200 | 1.19565 | 1.27344
   42 | 0.21423 | 0.43361 | 0.59184 | 0.71453 | 0.81410 | 0.89751 | 0.96905 | 1.08680 | 1.18114 | 1.25947
   43 | 0.18870 | 0.41172 | 0.57208 | 0.69618 | 0.79678 | 0.88098 | 0.95314 | 1.07185 | 1.16688 | 1.24574
   44 | 0.16330 | 0.39002 | 0.55253 | 0.67806 | 0.77969 | 0.86469 | 0.93748 | 1.05713 | 1.15285 | 1.23225
   45 | 0.13800 | 0.36851 | 0.53319 | 0.66016 | 0.76283 | 0.84862 | 0.92204 | 1.04264 | 1.13904 | 1.21897
   46 | 0.11279 | 0.34717 | 0.51405 | 0.64247 | 0.74619 | 0.83277 | 0.90682 | 1.02836 | 1.12545 | 1.20590
   47 | 0.08765 | 0.32598 | 0.49509 | 0.62498 | 0.72975 | 0.81712 | 0.89180 | 1.01429 | 1.11207 | 1.19304
   48 | 0.06257 | 0.30494 | 0.47632 | 0.60768 | 0.71350 | 0.80168 | 0.87699 | 1.00042 | 1.09888 | 1.18037
   49 | 0.03753 | 0.28403 | 0.45770 | 0.59056 | 0.69744 | 0.78642 | 0.86236 | 0.98674 | 1.08587 | 1.16789
   50 | 0.01251 | 0.26325 | 0.43925 | 0.57361 | 0.68156 | 0.77134 | 0.84792 | 0.97324 | 1.07305 | 1.15559














Table 1 (cont.) 











k \ n    125      150      175      200      225      250      300      350      400
 51   0.24258  0.42094  0.55682  0.66585  0.75644  0.83365  0.95991  1.06041  1.14346
 52   0.22201  0.40278  0.54019  0.65030  0.74170  0.81955  0.94676  1.04793  1.13149
 53   0.20154  0.38475  0.52371  0.63490  0.72712  0.80561  0.93376  1.03561  1.11969
 54   0.18115  0.36684  0.50737  0.61966  0.71270  0.79183  0.92093  1.02345  1.10804
 55   0.16084  0.34904  0.49116  0.60456  0.69842  0.77819  0.90824  1.01144  1.09654
 56   0.14059  0.33136  0.47508  0.58959  0.68428  0.76470  0.89570  0.99957  1.08518
 57   0.12040  0.31378  0.45913  0.57476  0.67028  0.75135  0.88329  0.98784  1.07396
 58   0.10026  0.29630  0.44329  0.56005  0.65641  0.73812  0.87102  0.97624  1.06287
 59   0.08016  0.27891  0.42756  0.54546  0.64267  0.72503  0.85888  0.96478  1.05192
 60   0.06009  0.26160  0.41193  0.53099  0.62904  0.71206  0.84687  0.95344  1.04108
 61   0.04005  0.24437  0.39641  0.51663  0.61553  0.69921  0.83498  0.94222  1.03037
 62   0.02002  0.22721  0.38098  0.50237  0.60213  0.68647  0.82320  0.93112  1.01978
 63   0.00000  0.21012  0.36564  0.48822  0.58884  0.67384  0.81154  0.92013  1.00930
 64      -     0.19309  0.35039  0.47416  0.57566  0.66132  0.79998  0.90925  0.99893
 65      -     0.17612  0.33521  0.46020  0.56257  0.64891  0.78854  0.89848  0.98866
 66      -     0.15919  0.32012  0.44632  0.54958  0.63659  0.77719  0.88782  0.97850
 67      -     0.14232  0.30510  0.43253  0.53668  0.62437  0.76595  0.87725  0.96844
 68      -     0.12548  0.29014  0.41882  0.52386  0.61224  0.75480  0.86678  0.95848
 69      -     0.10868  0.27525  0.40519  0.51114  0.60020  0.74374  0.85640  0.94861
 70      -     0.09191  0.26042  0.39164  0.49850  0.58824  0.73277  0.84612  0.93883
 71      -     0.07516  0.24565  0.37816  0.48593  0.57637  0.72189  0.83592  0.92914
 72      -     0.05844  0.23093  0.36474  0.47344  0.56458  0.71110  0.82581  0.91954
 73      -     0.04173  0.21626  0.35139  0.46103  0.55287  0.70039  0.81579  0.91002
 74      -     0.02503  0.20164  0.33811  0.44869  0.54124  0.68976  0.80584  0.90058
 75      -     0.00834  0.18706  0.32488  0.43641  0.52967  0.67920  0.79598  0.89122
 76      -        -     0.17252  0.31171  0.42420  0.51818  0.66872  0.78619  0.88194
 77      -        -     0.15802  0.29859  0.41205  0.50676  0.65831  0.77648  0.87274
 78      -        -     0.14355  0.28553  0.39997  0.49540  0.64798  0.76684  0.86361
 79      -        -     0.12911  0.27251  0.38794  0.48410  0.63771  0.75727  0.85455
 80      -        -     0.11470  0.25954  0.37596  0.47287  0.62751  0.74777  0.84556
 81      -        -     0.10031  0.24661  0.36404  0.46169  0.61738  0.73833  0.83663
 82      -        -     0.08594  0.23373  0.35218  0.45058  0.60730  0.72896  0.82778
 83      -        -     0.07159  0.22088  0.34036  0.43952  0.59729  0.71966  0.81899
 84      -        -     0.05725  0.20807  0.32859  0.42851  0.58734  0.71041  0.81026
 85      -        -     0.04293  0.19529  0.31686  0.41755  0.57745  0.70123  0.80159
 86      -        -     0.02862  0.18254  0.30518  0.40665  0.56761  0.69211  0.79298
 87      -        -     0.01431  0.16983  0.29354  0.39579  0.55783  0.68304  0.78443
 88      -        -     0.00000  0.15714  0.28194  0.38498  0.54810  0.67403  0.77594
 89      -        -        -     0.14448  0.27038  0.37421  0.53842  0.66507  0.76750
 90      -        -        -     0.13184  0.25885  0.36349  0.52879  0.65617  0.75912
 91      -        -        -     0.11922  0.24736  0.35280  0.51922  0.64732  0.75079
 92      -        -        -     0.10662  0.23590  0.34216  0.50968  0.63852  0.74252
 93      -        -        -     0.09404  0.22447  0.33156  0.50020  0.62976  0.73429
 94      -        -        -     0.08147  0.21307  0.32099  0.49076  0.62106  0.72611
 95      -        -        -     0.06891  0.20170  0.31046  0.48136  0.61240  0.71798
 96      -        -        -     0.05637  0.19035  0.29997  0.47201  0.60379  0.70990
 97      -        -        -     0.04383  0.17903  0.28951  0.46269  0.59522  0.70186
 98      -        -        -     0.03130  0.16773  0.27907  0.45342  0.58670  0.69387
 99      -        -        -     0.01878  0.15645  0.26867  0.44419  0.57822  0.68593
100      -        -        -     0.00626  0.14520  0.25830  0.43499  0.56978  0.67802






























































Table 1 (cont.) 























k \ n    225      250      300      350      400    |  k \ n    350      400
101   0.13396  0.24796  0.42583  0.56138  0.67016  |  151   0.17626  0.31517
102   0.12274  0.23764  0.41670  0.55302  0.66234  |  152   0.16900  0.30860
103   0.11153  0.22735  0.40761  0.54470  0.65456  |  153   0.16174  0.30203
104   0.10034  0.21708  0.39856  0.53641  0.64682  |  154   0.15450  0.29548
105   0.08916  0.20683  0.38953  0.52817  0.63912  |  155   0.14726  0.28895
106   0.07799  0.19661  0.38054  0.51996  0.63145  |  156   0.14003  0.28242
107   0.06683  0.18641  0.37158  0.51178  0.62383  |  157   0.13280  0.27591
108   0.05568  0.17622  0.36265  0.50364  0.61624  |  158   0.12558  0.26941
109   0.04453  0.16606  0.35375  0.49553  0.60868  |  159   0.11837  0.26292
110   0.03340  0.15591  0.34487  0.48745  0.60116  |  160   0.11117  0.25644
111   0.02226  0.14577  0.33602  0.47941  0.59367  |  161   0.10397  0.24998
112   0.01113  0.13566  0.32720  0.47139  0.58622  |  162   0.09678  0.24352
113   0.00000  0.12555  0.31841  0.46341  0.57880  |  163   0.08959  0.23707
114      -     0.11546  0.30963  0.45545  0.57141  |  164   0.08240  0.23064
115      -     0.10538  0.30089  0.44753  0.56405  |  165   0.07522  0.22421
116      -     0.09531  0.29216  0.43963  0.55672  |  166   0.06805  0.21779
117      -     0.08526  0.28346  0.43176  0.54942  |  167   0.06088  0.21138
118      -     0.07520  0.27478  0.42392  0.54215  |  168   0.05371  0.20498
119      -     0.06516  0.26612  0.41610  0.53491  |  169   0.04654  0.19859
120      -     0.05513  0.25748  0.40831  0.52770  |  170   0.03938  0.19220
121      -     0.04510  0.24885  0.40054  0.52051  |  171   0.03221  0.18583
122      -     0.03507  0.24025  0.39280  0.51335  |  172   0.02505  0.17946
123      -     0.02505  0.23167  0.38508  0.50622  |  173   0.01789  0.17310
124      -     0.01503  0.22310  0.37738  0.49911  |  174   0.01074  0.16674
125      -     0.00501  0.21455  0.36970  0.49203  |  175   0.00358  0.16040
126      -        -     0.20601  0.36205  0.48497  |  176      -     0.15406
127      -        -     0.19749  0.35442  0.47794  |  177      -     0.14772
128      -        -     0.18898  0.34681  0.47093  |  178      -     0.14139
129      -        -     0.18049  0.33922  0.46394  |  179      -     0.13507
130      -        -     0.17201  0.33164  0.45698  |  180      -     0.12875
131      -        -     0.16354  0.32409  0.45004  |  181      -     0.12244
132      -        -     0.15508  0.31656  0.44312  |  182      -     0.11613
133      -        -     0.14664  0.30904  0.43622  |  183      -     0.10983
134      -        -     0.13820  0.30154  0.42934  |  184      -     0.10353
135      -        -     0.12978  0.29406  0.42248  |  185      -     0.09723
136      -        -     0.12136  0.28659  0.41564  |  186      -     0.09094
137      -        -     0.11296  0.27914  0.40883  |  187      -     0.08465
138      -        -     0.10456  0.27171  0.40203  |  188      -     0.07837
139      -        -     0.09617  0.26429  0.39524  |  189      -     0.07209
140      -        -     0.08778  0.25689  0.38848  |  190      -     0.06581
141      -        -     0.07940  0.24950  0.38174  |  191      -     0.05954
142      -        -     0.07103  0.24212  0.37501  |  192      -     0.05326
143      -        -     0.06266  0.23475  0.36830  |  193      -     0.04699
144      -        -     0.05430  0.22740  0.36160  |  194      -     0.04072
145      -        -     0.04594  0.22006  0.35492  |  195      -     0.03445
146      -        -     0.03758  0.21274  0.34826  |  196      -     0.02819
147      -        -     0.02923  0.20542  0.34161  |  197      -     0.02192
148      -        -     0.02088  0.19812  0.33498  |  198      -     0.01566
149      -        -     0.01252  0.19082  0.32836  |  199      -     0.00939
150      -        -     0.00417  0.18354  0.32176  |  200      -     0.00313














































Biometrika (1961), 48, 1 and 2, p. 167 
Printed in Great Britain 


A distribution analogous to the Borel—Tanner 


By FRANK A. HAIGHT 
Institute of Transportation and Traffic Engineering, University 
of California, Los Angeles 
1. INTRODUCTION 
Let α > 0 and let r be a positive integer. Then

$$p(x; r, \alpha) = A(x, r)\,\frac{\alpha^{x-r}}{(1+\alpha)^{2x-r}} \qquad (x = r, r+1, \ldots)$$

is a probability distribution when

$$A(x, r) = \frac{r}{x}\binom{2x-r-1}{x-1}.$$

A proof of this fact for r = 1 will be found in a paper to be published shortly by the present
writer (Haight, 1961); a slightly modified proof is valid for general r. In the theory of queues
p(x; r, α) represents the probability that exactly x members of a queue will be served before
the queue first vanishes, beginning with r members, and with α equal to the traffic intensity,
assuming Poisson arrivals and negative exponential service periods. If the service periods
are of constant length, the resulting distribution is that due to Borel (1942) and Tanner
(1953). Also, if r = 1, A(x, 1) gives the number of ways brackets can be introduced into a
product of x factors.
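For any r, the generating function of A(x, r) is

$$\theta(s) = \sum_{x=r}^{\infty} A(x, r)\,s^{x} = \left(\frac{1-\sqrt{1-4s}}{2}\right)^{r},$$

and therefore the generating function of the probability distribution is

$$\phi(s) = \left(\frac{1+\alpha}{\alpha}\right)^{r}\theta\!\left(\frac{\alpha s}{(1+\alpha)^{2}}\right) = \left(\frac{1+\alpha-\sqrt{(1+\alpha)^{2}-4\alpha s}}{2\alpha}\right)^{r}.$$

From this we find

$$\text{mean} = \frac{r}{1-\alpha},$$

which agrees with that of the Borel-Tanner distribution, and

$$\text{variance} = \frac{r\alpha(1+\alpha)}{(1-\alpha)^{3}},$$

which exceeds that of the Borel-Tanner distribution by a factor of (1 + α). If α > 1,
$\phi(1) = \alpha^{-r}$, and therefore $p(\infty; r, \alpha) = 1-\alpha^{-r}$.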
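The formulas of this section can be checked numerically. The following Python sketch evaluates p(x; r, α) in log space and compares truncated sums with the stated mean r/(1 − α) and variance rα(1 + α)/(1 − α)³. The function names, the use of `math.lgamma`, and the truncation point `x_max` are illustrative assumptions and are not part of the paper.

```python
from math import exp, lgamma, log


def haight_pmf(x, r, alpha):
    """p(x; r, alpha) = A(x, r) * alpha**(x - r) / (1 + alpha)**(2x - r)."""
    if x < r:
        return 0.0
    # log A(x, r) = log(r / x) + log C(2x - r - 1, x - 1), using log-gamma
    # so that large x does not overflow.
    log_a = log(r) - log(x) + lgamma(2 * x - r) - lgamma(x) - lgamma(x - r + 1)
    return exp(log_a + (x - r) * log(alpha) - (2 * x - r) * log(1 + alpha))


def moments(r, alpha, x_max=2000):
    """Total mass, mean and variance from the sum truncated at x_max (an assumed cut-off)."""
    xs = range(r, x_max + 1)
    probs = [haight_pmf(x, r, alpha) for x in xs]
    total = sum(probs)
    mean = sum(x * p for x, p in zip(xs, probs))
    var = sum((x - mean) ** 2 * p for x, p in zip(xs, probs))
    return total, mean, var


total, mean, var = moments(r=2, alpha=0.5)
print(total, mean, var)            # compare with 1, r/(1-a), r*a*(1+a)/(1-a)**3
print(moments(r=1, alpha=2.0)[0])  # for alpha > 1 the mass should approach alpha**(-r)
```

Working with logarithms is a design choice made here because the binomial coefficient and the power of (1 + α) both become very large long before the product does.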


2. TABLES 
In the table which follows, we give values of this distribution, correct to four places of
decimals, in cumulative form, i.e.

$$P(x; r, \alpha) = \sum_{y=r}^{x} p(y; r, \alpha)$$

for the parameter values r = 1, α = 0.01 (0.01) 0.62. In every case, the values of x taken
are large enough to ensure that P(x; r, α) > 0.999. The upper limit of 0.62 for α was taken
because this was the limit of the table of cumulative sums for the Borel-Tanner distribution
published in a recent issue of this journal (Haight & Breuer, 1960). The function has not
'settled down' at this point and it is recognized that in practice it is likely that values of α
just less than unity will occur. Plans are being made for additional computations in
connexion with both functions, which will also include the case r > 1.
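A row of the cumulative table for r = 1 can be regenerated as follows. This is a minimal sketch, not the author's computing procedure: it uses the ratio A(x + 1, 1)/A(x, 1) = 2(2x − 1)/(x + 1) of successive bracketing numbers, and the function name and the `target` argument are hypothetical.

```python
def cumulative_row(alpha, target=0.999):
    """Return [P(1; 1, alpha), P(2; 1, alpha), ...] up to the first value above target."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("this sketch assumes 0 < alpha < 1")
    p = 1.0 / (1.0 + alpha)              # p(1; 1, alpha)
    step = alpha / (1.0 + alpha) ** 2    # factor gained with each extra customer
    x, cum, row = 1, p, [p]
    while cum <= target:
        # p(x + 1) = p(x) * [A(x + 1, 1) / A(x, 1)] * alpha / (1 + alpha)**2
        p *= step * 2 * (2 * x - 1) / (x + 1)
        x += 1
        cum += p
        row.append(cum)
    return row


for alpha in (0.01, 0.10, 0.50):
    print(alpha, [round(v, 5) for v in cumulative_row(alpha)])
```

Since the printed entries are stated to be correct to four decimal places, agreement in the fifth place should not be expected everywhere.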


REFERENCES 


Borel, E. (1942). Sur l'emploi du théorème de Bernoulli pour faciliter le calcul d'une infinité de
coefficients. Application au problème de l'attente à un guichet. C.R. Acad. Sci., Paris, 214, 452-6.

Haight, Frank A. (1961). Expected utility for queues servicing messages with exponentially decaying
utility (to appear).

Haight, Frank A. & Breuer, Melvin Allen (1960). The Borel-Tanner distribution. Biometrika,
47, 143-50.

Tanner, J. C. (1953). A problem of interference between two queues. Biometrika, 40, 58-69.










Table 1. Cumulative probabilities P(x; 1, α) (r = 1)

  α      x = 1    x = 2    x = 3    x = 4    x = 5    x = 6    x = 7    x = 8    x = 9    x = 10
0.01   0.99010  0.99980
0.02   0.98039  0.99923
0.03   0.97087  0.99833  0.99988
0.04   0.96154  0.99710  0.99973
0.05   0.95238  0.99557  0.99949
0.06   0.94340  0.99378  0.99916
0.07   0.93458  0.99172  0.99871  0.99978
0.08   0.92593  0.98944  0.99815  0.99964
0.09   0.91743  0.98693  0.99746  0.99945
0.10   0.90909  0.98422  0.99664  0.99921
0.11   0.90090  0.98133  0.99569  0.99890  0.99970
0.12   0.89286  0.97827  0.99461  0.99852  0.99956
0.13   0.88496  0.97506  0.99341  0.99808  0.99941
0.14   0.87719  0.97169  0.99205  0.99753  0.99918
0.15   0.86957  0.96820  0.99057  0.99691  0.99892  0.99961
0.16   0.86207  0.96458  0.98896  0.99621  0.99862  0.99948
0.17   0.85470  0.96085  0.98721  0.99540  0.99825  0.99931
0.18   0.84746  0.95701  0.98533  0.99448  0.99779  0.99907
0.19   0.84034  0.95309  0.98334  0.99349  0.99730  0.99883  0.99948
0.20   0.83333  0.94907  0.98122  0.99238  0.99672  0.99853  0.99932
0.21   0.82645  0.94499  0.97899  0.99118  0.99608  0.99819  0.99914
0.22   0.81967  0.94083  0.97665  0.98998  0.99536  0.99779  0.99892  0.99946
0.23   0.81301  0.93661  0.97419  0.98847  0.99455  0.99732  0.99865  0.99930
0.24   0.80645  0.93233  0.97163  0.98696  0.99366  0.99680  0.99834  0.99906
0.25   0.80000  0.92800  0.96896  0.98534  0.99268  0.99620  0.99797  0.99889  0.99928
0.26   0.79365  0.92363  0.96620  0.98363  0.99162  0.99555  0.99757  0.99864  0.99917
0.27   0.78740  0.91921  0.96334  0.98181  0.99047  0.99482  0.99711  0.99835  0.99904
0.28   0.78125  0.91477  0.96041  0.97991  0.98924  0.99402  0.99659  0.99802  0.99883  0.99930
0.29   0.77519  0.91028  0.95737  0.97788  0.98789  0.99312  0.99599  0.99761  0.99855  0.99911
0.30   0.76923  0.90578  0.95426  0.97577  0.98646



Table 1 (cont.)

x  P(x; 1, α)        x  P(x; 1, α)        x  P(x; 1, α)        x  P(x; 1, α)        x  P(x; 1, α)
α = 0.30 (cont.)     α = 0.33 (cont.)     α = 0.36 (cont.)     α = 0.39 (cont.)     α = 0.42 (cont.)
6 0-99215 1l —-0-99873 13 —-0-99892 1l 099645 3 091202 
7 99533 12 -99915 14 -99923 12 -99741 4 94384 
. -99716 13 -99809 5 -96240 
9 99824 14 -99858 
10 -99889 <= oe @ = 037 15 -99894 6  0-97400 
1 0-74627 1 0-72993 7 *98159 
1l —-0-99929 2 “88758 2 87382 16 099920 8 -98673 
3 *94109 3 -93055 9 -99030 
4 -96642 4 -95851 a = 0-40 10 -99283 
oe 5 97985 5 97394 1 0-71429 
a es 1l 0-99465 
9 .90125 6 0-98748 6 0-98307 2 86006 12 -99598 
. an 7° -99202 7 ~—--98872 [. = 13 -99696 
aon 8 99481 8 99234 + SS 14 —--99769 
5 .98495 9 -99657 9 -99472 5 96727 15 -99823 
10 99771 10 "99632 6 097789 
16 0-99864 
6 099112 11 = 0-99741 7 "98470 17 -99895 
7 99462 11 099845 12 -99816 8 98922 - ake 
8 -99668 12 *99894 13 -99868 9 -99229 
9 -99792 13 *99927 14 -99905 10 99442 
10 -99868 — a = 0-43 
ll 0-99915 diaintatd a = 0:38 “ poco : Bros 
1 074074 ‘ ‘ 
2 88299 5 + 14 —--99834 3 90819 
a = 0-32 3  —--93763 : = 15 —--99876 4 94070 
4 -96386 3 "92693 5 *95984 
1 —0-75758 : ania 4 95572 16 099907 
2 89671 5 -97180 6 0-97192 
3 -94781 $ 
i = 0-41 7 97990 
4 = -97127 Soe 6 0-98142 " 8 98585 
5 *98333 8 99405 | -98746 1 0-70922 9 -98917 
6  0-98998 9 “99601 9 -99398 3 91581 
7 99382 >. ee 10 99575 4  — -94691 11 —_0-99389 
8 -99611 : 5 *96487 12 99535 
: =. 2 aaa = oom 13 -99644 
-99839 1 “99871 12 99782 6 097598 
10 99839 3 14 -99726 
: ata 13 -99842 7 —-+98318 15 —--99788 
ll 0-99895 14 “99885 8 ‘98801 
12 “99931 a = 0:36 15 -99916 - ress 16 0-99835 
1 onan : = 
a = 0-33 2 -87841 a = 0-39 1l 099532 
3 -93412 12 99652 
L = -0-75188 4 96123 1 071068 13 —--99739 a = 0-44 
2 89215 5 97600 S = B64 14 _—--99803 
3 94449 3 92326 15 99851 1 0-69444 
4 -96890 6  0-98463 4 95284 2 “84179 
5 -98165 7 -98991 5 96956 16 0-99887 3 -90432 
8 99325 17 -99913 4 *93749 
a te a 
: 10 -99685 ‘ is i 
8 99551 8 —_--99031 t= oe 6 096975 
9 -99709 11  0-99781 9 99314 1 070423 7 -97812 
10 -99809 12 -99847 10 -99509 2 ‘85091 8 -98389 













Table 1 (cont.)

x  P(x; 1, α)        x  P(x; 1, α)        x  P(x; 1, α)        x  P(x; 1, α)        x  P(x; 1, α)
α = 0.44 (cont.)     α = 0.46 (cont.)     α = 0.48 (cont.)     α = 0.50 (cont.)     α = 0.51 (cont.)
9 0-98797 12 0-99315 11 0-98914 6 095517 24 0-99859 
10 -99091 13 -99460 12 -99136 7 96577 25 -99880 
14 99572 13 -99308 8 97342 
2 — 15 -99659 14 99443 9 -97909 7 ee 

1 “99467 15 “9954 10 . " 
18 -90587 16 099727 — 98337 
14 -99678 17 “99781 16 0-99634 11 0-98665 a = 0-52 
15 99748 18 "99823 17 -99702 12 -98921 
19 *99857 18 -99756 13 -99122 1 0-65789 
ao ——_—  lhUe ee hm! | 6 
; , 20 99836 15 -99410 +87 
18 -99875 21 0-99905 4 -91011 
19 -99900 21 «099865 16 0-99513 5 -93374 
a = 0-47 22 -99889 17 99596 
23 -99909 18 -99664 6  0-94970 
a= 0-45 1 0-68027 19 -99719 7 -96099 
1 0-68966 2 "82823 20 .99765 8 -96925 
2 -83727 3 89259 a = 0-49 9 97544 
3 -90045 4 92759 og enaieee 21 ~—-0-99803 10 -98018 
4  —--93426 5 = -94890 : 22 = -99834 
5 95452 SSR 23 ~—«--99860 11 098387 
6 096281 3 -88466 24 poones 12 98677 
6 0-96753 ‘ pei 4 92074 25 —--99899 13 -98908 
7 ~~ -97628 “9790 5 94304 14 —--99094 
8 98237 9 “98390 26 ~=—-:0-99914 15 99245 
° pros 10 -98750 6 095780 
7 “96804 16 —0-99368 
10 98987 11 =0-99021 8 97539 os a - froinned 
11 099221 13 -99386 10 00ees 1  0-66225 
14 -99630 = 0-98794 4 -91369 
15 -99708 16 0-99675 12 -99033 5 .93690 21 0-99726 
17 -99735 13 -99220 22 -99766 
16 0-99769 18 -99783 14 -99367 6 0:95247 23 -99800 
17 “99816 19 -99821 15 "99484 : 96342 24 “99829 
18 -99853 20 99852 8 97138 25 -99853 
19 90888 16 = 0-99577 “ a 
20 -99905 21 0-99877 17 *99652 , 98182 26 0-99874 
22 -99897 18 -99713 0 : 27 -99891 
" 12 -98804 
L 668408 a = 0-48 21 —-0-99836 13 -99020 a = 053 
2 -83274 29 .99863 14 -99193 
3 89653 1  0-67568 q 15 -99332 1  0-65359 
23 99886 : 90187 
4 -93095 2 -82375 24 99905 : 
5 95175 3 -88865 16 0-99445 3 -86858 
4 -92420 17 -99537 4 -90651 
6 096521 5 -94602 a = 0°50 18 -99612 5 -93055 
7 97434 19 -99674 
8 -98074 6 0-96036 1 0-66667 20 -99725 6 0-94688 
9 98535 7 97024 2 -81482 7 -95850 
10 -98873 8 *97727 3 -88066 21 099768 8 *96705 
9 -98241 4 91724 22 -99804 9 -97350 
ll -0-99125 10 -98624 5 -94000 23 -99834 10 97847 





Table 1 (cont.)

x  P(x; 1, α)        x  P(x; 1, α)        x  P(x; 1, α)        x  P(x; 1, α)        x  P(x; 1, α)
α = 0.53 (cont.)     α = 0.54 (cont.)     α = 0.56 (cont.)     α = 0.57 (cont.)     α = 0.58 (cont.)
ll 098236 26 —-0-99817 6 093801 16  0-98880 24 =: 099555 
12 98544 27 -99840 7 -95057 17 -99031 25 -99605 
13 -98791 28 -99860 s 95996 18 -99159 
14 -98991 29 -99877 9 -96717 19 -99268 26 099649 
15 99154 30 —--99892 10 —--97281 20 ~—--99361 23 8088 

28 -99721 
16 0-99288 31 0-99905 ll 0-97729 21 0:99441 29 *99751 
17 -99398 12 -98090 22 -99510 30 -99777 
18 -99490 a = 0-55 13 -98384 23 -99570 ; 
19 +9566 6 oe ee BS 
" ay : groom 4 = ™ or 33 -99838 
21 ~—«0-99684 3 -86048 16  -0-98993 26 = 0-99706 34 "99854 
29 99729 4 -89918 17 99133 27 -99740 35 99868 
23 -99767 5 -92399 18 -99251 28 *99770 36 0-99880 
24 99800 6 0-94103 19 99351 20 "99796 37 99891 
25 -99828 4 95329 20 -99436 30 “99819 38 -99901 
26 —-0-99852 : pone 21 —-0-99509 31 —0-99839 
27 -99872 22 99572 32 "99857 a = 0:59 
A 10 -97479 33 -99873 
29 -99904 11097907 24 -99673 34 "99887 9 T7671 
12 -98250 25 -99713 35 -99899 3 84422 
13 -98528 ; ‘ 
a = 054 a aan Ue oo CU pci-e 

1 0-64935 15 —--98942 2... Se 

° 79720 28 99806 a = 0-58 6  0-92860 

: enabe 16 0-99098 29 -99829 7 94201 

4 90285 17 -99228 30 -99849 1 0-63291 8 -95218 

pe 18 -99337 2 17995 9 -96009 

5 88738 19 99429 31 —0-99867 3 pen 10. —«-96687 

: 20 -99507 32 -99883 4 *88797 
ieee 33 -99897 5 91379 11 —0-97148 
} pa 21 =: 0-99573 34 -99910 12 “97557 
: 22 -99629 6  0-93179 13 -97899 
9 ‘97146 23 -99677 7 94493 14 .98184 
10 97665 
i 24 -99719 a = 0-57 8 "95485 15 -98423 
25 99755 9 96253 
. eign 10 — -96860 16 — 0-98625 
12 -98398 26 099786 2 -78423 17 -98797 
13 98660 27 -99813 3 *85235 11 0-97347 18 -98944 
14 98873 28 -99836 4 89173 12 97743 19 -99071 
15 99048 29 -99856 5 91723 13 -98069 20 -99180 
30 -99873 14 -98339 
16 0-99192 6 0-93492 F 21 0-99275 
15 98565 
17 -99312 31 0-99888 7 -94778 22 -99358 
18 99412 32 -99901 . 95744 16 098755 23 -99430 
19 -99496 9 -96489 17 -98916 24 -99493 
20 -99567 an on 10 -97075 18 -99053 25 -99548 
19 -99170 
21  0-99627 1 0-64103 1l 0-97543 20 99271 26 —0-99597 
22 -99678 2 -78854 12 -97922 27 -99640 
23 -99721 3 -85643 13 -98232 21 0-99358 28 -99679 
24 -99758 4 “89548 14 -98488 22 -99433 29 -99713 
25 -99790 5 -92064 15 -98701 23 -99498 30 -99743 










Table 1 (cont.)

x  P(x; 1, α)        x  P(x; 1, α)        x  P(x; 1, α)        x  P(x; 1, α)        x  P(x; 1, α)
α = 0.59 (cont.)     α = 0.60 (cont.)     α = 0.61 (cont.)     α = 0.61 (cont.)     α = 0.62 (cont.)
31 0-99769 20 ~=0-99084 6  0-92206 36 ~=—-: 099798 19 0-98711 
32 -99793 7 -93599 37 -99816 20 .98846 
33 -99814 21 0-99187 8 94664 38 -99832 
34 -99833 22 *99277 9 95499 39 -99847 21 ~—s-: 0-98965 
35 -99850 23 "99356 10 -96167 40 -99860 22 -99070 

24 99425 23 .99162 

36 ©=—s-_-0-99865 25 -99486 11 0-96710 41 0-99872 24 99244 

37 -99878 12 97158 42 -99883 25 .99317 
38 -99890 26 =: 099540 13 97531 43 -99893 

39 -99901 27 “99588 14 97844 44 99902 26 =: 0-993 82 

‘ pe 15 -98109 27 99440 

oouw 30 -99701 16 0-98335 a = 0-62 os pers 

2 -77149 31: 0-99731 18 -98697 1 = 061728 

3 -84016 32 ‘99758 19 -98842 2 “76311 31  —«0-99617 

4 -88040 33 ‘99782 20 -98969 3 *83201 32 99651 

35 -99822 21 0-99080 5 -89962 34 99708 

7 -93906 36 0-99839 23 -99262 6 0-91870 

8 -94948 37 -99855 24 -99337 7 -93286 36 0-99755 

9 -95762 38 -99869 25 -99404 8 *94373 37 99775 
10 -96411 39 -99882 9 *95229 38 -99793 
ll 0-96936 27 99516 40 .99825 
12 -97367 41 0-99903 28 -99563 ll 0-96478 
13 97724 29 -99605 12 96942 41 0-99839 
14 -98023 aw 30 99642 13 -97330 42 -99852 
15 -98275 14 97657 43 -99864 

1 0-62112 31 0-99675 15 -97935 44 .99875 

16 0-98489 2 76729 32 99705 45 -99885 
17 98672 3 83609 33 99732 16 0-98173 
18 -98830 4 87656 34 99756 17 -98378 46 0-99894 
19 98966 5 -90323 35 -99778 18 98556 47 99902 




















Biometrika (1961), 48, 1 and 2, p. 175 
Printed in Great Britain 


Occupancy probability distribution critical points 


By W. L. NICHOLSON†


Operations Research and Synthesis Operation 
Hanford Laboratories Operation 


1. INTRODUCTION


Many statistical problems in the field of hypothesis testing involve a null hypothesis
probability law for the random independent allocation of N identical balls in K equally
likely cells. The alternatives to the equally likely null hypothesis are unequal cell probabilities
and/or uniform systematic allocation. When the number of balls N is reasonably large, say
N > 5K, the classical χ² goodness-of-fit test criterion can be used to decide among the null
hypothesis and the two alternatives, a small χ² value favouring the systematic alternative
and a large χ² value favouring the unequal cell probabilities alternative. If N is small, and
if K > 2, neither χ² nor the well-tabulated binomial distribution is applicable as a test
criterion.

Stevens (1937) and later David (1950) suggest as a test criterion the number X of 'occupied'
cells, or equivalently the number K − X of 'empty' cells. Too many empty cells, a
significantly small value of X, favours an unequal cell probabilities alternative, while too few
empty cells, a significantly large value of X, favours a systematic smooth allocation of the
balls within the K cells. For N = K, David indicates that the test criterion X has good power
against both alternatives. As a result of numerical checks she suggests the use of the exact
probability distribution of X for N = K ≤ 20, a normal approximation for 20 < N ≤ 30
with K = 25 in all cases, and for N > 30 a χ² test based on expected cell frequencies of at
least 5. At the cross-over point of N = 30 she shows with an example that the occupancy
criterion has approximately the same power as that of the χ² goodness-of-fit criterion for a
test concerning the standard deviation of a normal distribution with known mean. For the
case of N = K, David tables the probability distribution of K − X and gives a short table of
suggested critical points.

In the goodness-of-fit type problems considered by David a completely specified
probability distribution is partitioned into a K-cell histogram with each cell having
probability 1/K under the null hypothesis. The value of K can always be selected equal to the
sample size N. An example is provided by Thomas (1951) in a null hypothesis stating that
K Poisson distributions all have the same but unknown mean where, in general, it is not
possible to choose K = N. Here, the conditional joint distribution of the K Poisson variables
given their sum, say N, is multinomial with equally likely cells. Thomas is only concerned
with large sample theory and uses appropriate normal distribution tests. There are, however,
many applications of this technique (for example, in the field of low level radioactive
counting) where a small sample size prohibits the use of the normal distribution. It is
primarily for these cases that David's tables have been extended.


† General Electric Company, Richland, Washington. Work performed under Contract No. AT(45-1)-1350
for the U.S. Atomic Energy Commission.










2. DESCRIPTION OF THE TABLE 


The occupancy probability distribution function H(X; K, N), the probability of having
exactly X occupied cells when N balls are randomly and independently distributed among
K equally probable cells, can be shown to be (see David, 1950 or Feller, 1950)

$$H(X; K, N) = \binom{K}{X}\sum_{j=0}^{X}(-1)^{j}\binom{X}{j}\left(\frac{X-j}{K}\right)^{N}.$$

A table of values of the occupancy probability distribution function H(X; K, N) for all
meaningful combinations of N = 1(1)30, K = 2(1)20, and X = 1(1)K has been calculated
on the IBM 650 of the Hanford Atomic Products Operation using the H-function recursion
formula,

$$H(X; K, N) = \frac{K-X+1}{K}\,H(X-1; K, N-1) + \frac{X}{K}\,H(X; K, N-1).$$
The recursion formula is a probabilistic statement of the fact that (thinking of the balls as 
being dropped sequentially into the K cells) X occupied cells after the Nth ball is dropped 
implies either X —1 occupied cells after the (N —1)st ball is dropped with the Nth ball 
falling in an empty cell, or X occupied cells after the (N — 1)st ball is dropped with the Nth 
ball falling into an occupied cell. The boundary conditions for the recursion relationship are 


$$H(1; K, N) = K^{1-N}, \qquad H(N; K, N) = K^{1-N}\,\frac{(K-1)!}{(K-N)!} \quad (N \le K).$$


The table, which is a direct extension of David’s for K = N = 3(1) 20, appears in Nicholson 
(1960), copies of which are available from the author. 
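The recursion and the closed form can be cross-checked in exact rational arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, the recursion is started from the trivial state of no balls and no occupied cells rather than from the boundary conditions quoted above, and the diagonal value H(N; K, N) = K^(1−N) (K−1)!/(K−N)! for N ≤ K is an assumption about the intended second boundary condition.

```python
from fractions import Fraction
from math import comb, factorial


def occupancy_closed_form(X, K, N):
    """H(X; K, N) from the inclusion-exclusion sum."""
    return comb(K, X) * sum(
        (-1) ** j * comb(X, j) * Fraction(X - j, K) ** N for j in range(X + 1)
    )


def occupancy_table(K, N_max):
    """table[N][X] = H(X; K, N) for N = 0..N_max, X = 0..K, built by the recursion."""
    table = [[Fraction(0)] * (K + 1) for _ in range(N_max + 1)]
    table[0][0] = Fraction(1)  # assumed start: no balls dropped, no cells occupied
    for N in range(1, N_max + 1):
        for X in range(1, min(K, N) + 1):
            table[N][X] = (
                Fraction(K - X + 1, K) * table[N - 1][X - 1]  # Nth ball opens a new cell
                + Fraction(X, K) * table[N - 1][X]            # Nth ball hits an occupied cell
            )
    return table


def boundary_one_cell(K, N):
    """H(1; K, N) = K**(1 - N)."""
    return Fraction(1, K ** (N - 1))


def boundary_diagonal(K, N):
    """Assumed diagonal condition H(N; K, N) = K**(1 - N) (K - 1)! / (K - N)!, N <= K."""
    return Fraction(factorial(K - 1), factorial(K - N) * K ** (N - 1))


H = occupancy_table(K=10, N_max=15)
print(float(sum(H[15][1:7])))  # probability of at most 6 occupied cells, K = 10, N = 15
```

Exact fractions are used so that the recursion and the alternating sum can be compared for equality rather than to within rounding error.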

To facilitate the performing of hypothesis tests using the occupancy criterion X, critical
points corresponding to upper and lower tail areas of 0.01, 0.025, 0.05 and 0.10 were
computed as appropriate sums of the entries in the above-mentioned table and appear below in
tabular form. For each K, X and lower tail area α the smallest value of N, say N₁, which
satisfies

$$\alpha_{1} = \sum_{j=1}^{X} H(j; K, N) \le \alpha,$$

is tabled in bold type along with the corresponding α₁, set in smaller type. For any N < N₁
the tail area at and below X is > α. A result of ≤ X occupied cells would not be significant
at the 100α % level, and there would be no reason to reject the null hypothesis in favour of an
unequal cell probability alternative. For N ≥ N₁ the tail area at and below X is ≤ α.
A result of ≤ X occupied cells would therefore be significant at the 100α % level, and
rejection of the null hypothesis in favour of an unequal cell probability alternative would be
in order.

Similarly, for each K, X and upper tail area α the largest value of N, say N₂, which satisfies

$$\alpha_{2} = \sum_{j=X}^{K} H(j; K, N) \le \alpha,$$

is tabled along with the corresponding α₂. For any N > N₂ the tail area at and above X is
> α, so a result of ≥ X occupied cells would not be significant at the 100α % level, and there
would be no reason to reject the null hypothesis in favour of a systematic smooth allocation
alternative. For N ≤ N₂ the tail area at and above X is ≤ α, so a result of ≥ X occupied
cells would be significant at the 100α % level, and rejection of the null hypothesis in favour
of the smooth allocation alternative would be in order.
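The two critical points can be found by a direct search over N. The following Python sketch is one possible reading of the definitions, not the procedure used to prepare the published table: the function names are hypothetical, floating-point arithmetic is used, the search is limited to N ≤ 30 as in the table, and the upper tail search starts at N = X on the assumption that fewer balls cannot occupy X cells.

```python
def occupancy_rows(K, N_max):
    """rows[N][X] = H(X; K, N) as floats for N = 0..N_max, built by the recursion."""
    rows = [[1.0] + [0.0] * K]
    for N in range(1, N_max + 1):
        prev = rows[-1]
        row = [0.0] * (K + 1)
        for X in range(1, min(K, N) + 1):
            row[X] = (K - X + 1) / K * prev[X - 1] + X / K * prev[X]
        rows.append(row)
    return rows


def lower_critical(K, X, alpha, N_max=30):
    """Smallest N <= N_max whose tail area at and below X is <= alpha, with that area."""
    rows = occupancy_rows(K, N_max)
    for N in range(1, N_max + 1):
        tail = sum(rows[N][1:X + 1])
        if tail <= alpha:
            return N, tail
    return None  # corresponds to a "> 30" entry


def upper_critical(K, X, alpha, N_max=30):
    """Largest N in X..N_max whose tail area at and above X is <= alpha, with that area."""
    rows = occupancy_rows(K, N_max)
    best = None
    for N in range(X, N_max + 1):
        tail = sum(rows[N][X:])
        if tail > alpha:
            break  # the upper tail area does not decrease as balls are added
        best = (N, tail)
    return best  # None corresponds to a dash in the table


print(lower_critical(10, 6, 0.10))
print(upper_critical(10, 8, 0.10))
```

The K = 10 cases printed here are the ones quoted in the paper's own illustration of the table, which makes them convenient for comparison.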



























For each K the lower tail table lists all X values for which the 10 % level of significance
N₁ entry is ≤ 30. For each K the upper tail table starts with the smallest X value for which
a 10 % level of significance N₂ entry exists and lists all subsequent X values for which the
1 % level of significance N₂ entry is ≤ 30. Dash marks indicate the non-existence of N₂
entries. In both lower and upper tail tables the entry > 30 is used in lieu of the exact value
which is beyond the range of calculation from the occupancy probability distribution table
used to compute the critical points.

To illustrate the use of the table of critical points consider a test based on K = 10 cells
which results in X = 6 occupied cells. The minimum number of balls which allows rejection
of the null hypothesis in favour of an unequal cell probabilities alternative is N₁ = 15 at the
10 % level of significance (the exact probability of not more than 6 occupied cells when 15
balls are distributed randomly and independently in 10 cells is α₁ = 0.070), N₁ = 16 at the
5 % level of significance, N₁ = 18 at the 2.5 % level of significance, and N₁ = 20 at the 1 %
level of significance. With K = 10 cells and X = 6 occupied cells rejection of the null
hypothesis in favour of a systematic smooth alternative is not possible at any significance
level ≤ 10 %. In order to get significance more cells must be occupied. For example, with
K = 10 cells and X = 8 occupied cells the maximum number of balls which allows rejection
of the null hypothesis in favour of a systematic smooth alternative is N₂ = 9 at the 10 % level
of significance and N₂ = 8 at the 2.5 % level of significance. With K = 10 rejection of the
null hypothesis in favour of the systematic alternative at the 1 % level of significance
demands at least X = 9 cells occupied; specifically, with X = 9 necessarily N = N₂ = 9, and
with X = 10 necessarily 10 ≤ N ≤ N₂ = 12.


The author is indebted to several members of the Hanford Atomic Products Operation for 
assistance in the calculation of the tables. W. F. Stevenson wrote the IBM 650 program and 
compiled the machine output for the occupancy probability distribution values. P. E. 
Leaverton checked the machine calculations and computed the critical points listed in the 
table below. Operating personnel of the Hanford Atomic Products Electronic Data 
Processing Operation handled the technical details of the IBM calculation. Appreciation is 
also expressed to E. S. Pearson for several suggestions which simplify the presentation of
the critical points table.


REFERENCES 


David, F. N. (1950). Two combinatorial tests of whether a sample has come from a given population.
Biometrika, 37, 97-110.

Feller, W. (1950). Probability Theory and Its Applications. New York: John Wiley and Sons, Inc.

Nicholson, W. L. (1960). Occupancy Probability Tables Based on the Multinomial Distribution for
Equally Probable Events. Report HW-57502 REV.

Stevens, W. L. (1937). Significance of grouping. Ann. Eugen., Lond., 8, 57-69.

Thomas, M. (1951). Some tests for randomness in plant populations. Biometrika, 38, 102-11.







































































Table 1. Critical points for occupancy probability distributions
Lower tail tests. Alternative: unequal cell probabilities | Upper tail tests. Alternative: systematic allocation
Lower tail probability level α | Upper tail probability level α
K, X | 0.01 | 0.025 | 0.05 | 0.10 | X | 0.01 | 0.025 | 0.05 | 0.10
     | N₁ α₁ | N₁ α₁ | N₁ α₁ | N₁ α₁ |   | N₂ α₂ | N₂ α₂ | N₂ α₂ | N₂ α₂
| = 
es 1 8 -008 7 016 6 031 | 5 -062+ tr Mle deed a me ee ee 
|\K=3 1 6 -004 5 012 4 037 | 4 -037 a te et i al _ — — = 
| 2 15 -007 12 023 11 -035 9 -078 
K=4 1 5 -004 4 016 4 -016 3 ot] 4|— — — ae _ — 4 -094 
| 2 10 -006 8 -023 7 046 6 -092 
3 21 -010- 18 023 16 040 | 13 -094 K=12 
| =5 1 4 -008 4 -008 3 -040 3 -040 5|—- — _ — 5 -038 5 -038 
Z 8 -007 7 016 6 -040 5 -098 
| 3 14 -008 12 .021 11 -035 9 -096 
4 28 -010- 24 -024 21 -050-| 18 -089 
| 
K=6 1 4 -005 4 -005 3 -028 3 -028 oh —_ — —_—_— — 5 -093 
| 2 7 -007 6 -020 6 -020 5 -059 ei — oe 6 -015 6 -015 7-054 
3 11 -010- 10 -019 9 -037 8 -071 
| 4 18 -010- 16 022 14 049 | 13 -072 K=13 
5|>30 —|>30 — 27 -044 | 23 -089 
K=7 1 4 -003 4 -003 3 -020 3 020 1 ics sale 6 -043 6 -043 
2 7 -003 6 011 5 -038 5 -038 7| 7 -006 8 -024 8 -024 9 058 
3 10 007 9 -016 8 -036 7 -080 
4 15 -008 13 -023 12 -038 | 11 -065 
5 23 = -009 20 024 18 046 | 16 -0ss 
6|}>30 —}|}>30 — |>30 — | 28 -092 
K=8 1 4 -002 3 016 3 -016 a. eo! ei — —s one — = 6 077 
3 6 -007 6 -007 5 06 | 4 0817|— — 7 -019 7 -019 8 067 || |\K=14 
3 9 -008 8 -020 7 050-| 7 050-| 8| 8 -002 9 011 10 -028 12 -093 
4 13 -008 12 015 11 -030 | 10 -056 
5 19 -007 17 017 15 -042 | 13  -100- 
6 28 -009 25 -020 22 -046 | 20 -080 
=9 1 4 -001 3 012 3 -012 3 012 | te wet! (seis 7° -038 7° -038 
2 6 -004 5 -018 5 018 | 4 078 | 8 008 8 -008 9 035 10-083 | 
3 9 -004 8 -012 7 033 6 090 | 9| 10 -005 11 -013 12 -029 14 081 | 
4 12 -007 11 015 10 -031 | 9 -065 
5 16 -009 15 -016 13 -048 | 12 -081 | 
6| 23 -007 20 022 18 -048 | 16 099 || K=15 
7|}>30 — 29 -023 26 -048 | 23 097 
|K=10| 1 3 -010 3 010 3 010 | 2 10/ 7) — — ee al eee: = 7 060 | 
| 2 6 -003 5 -014+ 5 014+} 4 064] 8/— — 8 -018 8 -018 9 069 | 
3 8 -007 7 022 7 022 | 6 068 | 9} 9 -004 10 017 11 044 12 ‘087 || 
4 11. -008 10 018 9 041 | 8 093 |10)12 -006 13-014 15 -046 16 070 
| 5 15 -007 13 -024 12 -045 | 11 -082 ij 
| 6| 20 -008 18 018 16 045 | 15 -070 
| 7 27 007 24 -020 22 039 | 20 -074 | 
| 8|}>30 — |>30 — |>30 — | 27 -0% 
| | 
‘K=11| 1 3 -008 3 -008 3 -008 2 -091 W—- — — — —_ — 7 oss || |K=16 
| 2 6 -002 5 -010+ 5 o10t| 4 03 | 8}— —| — — 8 -031 8 +031, 
| Random independent allocation of N balls into K cells Random independent allocation of N balls into | 
| so that < X cells are occupied. For given K and X and K cellsso that > X cells are occupied. For given || 
| any N > N, the tail area at and below X is < . K and Xandany N < N, the tail area at and above | 
X is < @. 


























Lower tail tests. 
Alternative: unequal cell probabilities 


Table 1 (cont.) 


Upper tail tests. 
Alternative: systematic allocation 





179 





















































Lower tail probability level a Upper tail probability level « 
X 0:01 0-025 0-05 0-10 X| 001 0-025 0:05 0-10 
Ni ay Ni ey Ni a | Ny % Nz N2 Xe Nz Xe N: as 

K=11] 3 8 -004 7 -016 7 -016 6 052 | 9/| 9 -008 9 -008 10-036 11 -089 
4 11 -004 10 -011 9 -027 8 073 | 10/11 -008 12 -023 13 -048 14. -086 
5 14-006 13-013 12 -026 | 11 -052 | 11|14 -007 15 -014 17 -040 19 -084 
6| . 18 -007 16 -021 15 -036 | 14 -062 
7 23 = -009 21 -020 19 -045 | 17 -097 
8i>30 — 28 -019 25 -046 | 23 -081 

K=12| 1 3 -007 3 -007 3 -007 2 -083 oe) = Le 8 -046 8 -046 
2 5 -008 5 -008 4 -045 4 -045 {(— 9 016 9 016 10-062 
3 8 -003 7 -o11 6 -041 6 041 | 10/10 -004 11 -018 12 -049 13. -099 
4 10-007 9 019 9 -019 8 -050*}| 11 | 12 -004 13-011 15 048 16 -082 
5 13-007 12 ic 11 -034 | 10 -072 | 12| 16 -007 18-023 19 -035 22 -094 
6 17-006 15 -020 14 036 | 13 -065 
7 21 = -008 19 -020 18 -033 | 16 -081 
8 27 = -007 24 -022 22 -046 | 20 -091 
9i>93 —i>sS — 29 -042 | 26 -089 

K=13! 1 3 -006 3 -006 3 -006 2 077 2. —. = i= —_ — 8 -064 
y 2 5 -006 5 -006 4 -039 4 -039 9|— — 9 -025- 9 025 10 -092 
3 7 -008 7 -008 6 -033 6 -033 |10| 10 -008 10-008 11 -034 12 -085 
4 10-004 9 -013 8 -038 8 -038 |11|12 -o09 12 -009 13-027 14 -058 
5 12 -009 11-022 11 022 | 10 -052 | 12| 14 -006 15 -014 17 -048 18 -077 
6 16 -005 14 -022 13 -042 | 12 -o80 | 13 | 18 -007 20 -020 22 -044 24 -079 
7 19 -009 18 016 16 -048 | 15 -080 
8 24 -008 22 = -020 20 -047 | 19 -070 
9 30 -009 28 -018 25 -048 | 23 -090 
10;}>30 — |>30 — |>30 — | 29 -o10- 

K=14} 1 3 -005 3 -005 3 -005 2 -071 ei a —_—  — — — 8 -082 
2 5 -005 5 -005 4 -034 4 e761 O)—) — — — 9 035 9 -035 
3 7 -006 7° -006 6 -027 6 <7 110) — = 10-013 10-013 11 053 
4 9 -009 9 -009 8 -029 7 086 |} 11/11 -004 12 -018 13 -049 14 -100- 
5 12-006 11 016 10 -038 9 090 | 12/13 -004 14 -014 15 -032 16 -062 
6 15 -006 14 -013 13-028 | 12 -057 | 13 | 16 -007 17 015 19 -046 20 = -070 
7 18 -009 17 -016 16 -629 | 14 -090 | 14/20 -007 22 -018 24 -037 27 = -084 
8 22 = -009 20 -024 19 039 | 17 -097 
9 27 -009 25-021 23 -044 | 21 -091 
10|/>30 —|>30 — 29 -039 | 26 -090 

K=15 1 3 -004 3 -004 3 -004 2 -067 9j—_ — — — 9 047 9 -047 
2 5 -004 5 -004 4 029 4 09 |/10;— — 10 019 10 019 11 -076 
3 7 -005 6 -022 6 -022 5 094 }11/11 -006 11 -006 12 -029 13 077 
4 9 -007 8 -023 8 023 7 072 |12| 13 -009 13 -009 14 027 15 -060 
5 12 -004 11-011 10 -029 9 072 | 13} 15 -007 16 018 17 -036 18 -063 
6 14  -008 13-019 12 -041 | 11 -086 | 14/18 -008 19 -015 21 = -043 23 -090 
7 17 -009 16 -018 15 -034 | 14 -063 | 15/23 -o10- 25-022 27 = -042 30 -o8s 
8 21 = -007 19 -022 18 -038 | 17 -063 
9 25 -009 23 -022 22 -034 | 20 -079 
10;>30 — 28 -022 26 -044 | 24 -084 
11|]>30 — |>30 — /|>30 — | 29 -092 














Lower tail tests. 
Alternative: unequal cell probabilities 


Table 1 (cont.) 


Upper tail tests. 
Alternative: systematic allocation 










































































X is < a. 


| 
| Lower tail probability level « | Upper tail probability level « 
| | 
| = 
X | 0-01 | 0-025 | 0-05 | 0-10 X} 001 0-025 0-05 010 | 
owe porsiontey ae 
va Nw % | Ny & | Ny % IN, Oy Nz Nz, & Nz os Nz %& 
| 
K=16|10| 28 009 26 021 | 24 047 | 22 097 
11} >30 — | >30 — | 29 -044 | 27 -081 
A=17 | 7 3-003 3 -003 | 3 -003 | 2 -059 9;— — — — —_ — 9 074 
2 5 003 4 -023 | 4 -023 | 4 023 |10;— — —_— = 10 -035 10-035 
3 7 -003 6 016 | G6 06 | 5 02311 — — 11 -014 11 -014 12-061 
4 9 004 8 014 | 8 014 | 7 052 [12/12 00s 13 02s-| 13 025 14-068 
| 3 11-006 10 017 | 9 047 | 9 -047 | 13 14 -008 14-008 15 -026 16 058 
6 | 13-009 12 -022 | 12 022 | 11 -0s2 | 14/| 16 -008 17 -020 18 -040 19 07 
7 16 -007 15 -ois 14 032 | 13 -066 | 15} 18 -00s 20-022 21 = -038 23-090 
| 8 19 -008 | 18 015 17 027 | 15 -090 | 16/22 -008 24 -023 25-035 28-093 
| 9} 23 006 | 21 -o18 19 -oso-| 18 -081 | 17 | 27 -o10- 30 022 30 022 |>30 — 
1/10} 27 -007 25 017 23 -040 | 21 -090 
ill); >30 — 29 021 27 043 | 25 = -085 
aa >30 — |>30 — | 30 -079 
K=18| 1 3 -003 3 -003 3 003 | 2 056 ij — — <= —_ — 9 -089 
P 5 -002 4 021 . 22) &0 1: — — — 10 -044 10-044 
3 | 7 -002 6 013 6 013 | 5 067 J11};/— — 11 020 11 -020 12 -080 
4) 9 -003 8 -012 7 044 7 044 112/12 -008 12 -008 13 -036 14 -094 | 
5 11 -004 10-013 9 039 9 -039 | 13 | 13  -003 14 014 15 -040 16-087 
6 13 -006 12 -015 11 -041 | 10 -098 | 14) 15 -004 16 -014 17 035 18-068 
uj 16 -005 14 -023 14 023 | 13 -oso+} 15/17 -004 19 023 20 -044 21 = 073 
8| 18 -009 17 018 16 036 | 15 -072 | 16} 20 -006 22 = -022 23 = -037 25° -084 
| 9 22 = -006 20 -019 19 -033 | 17 -095 |17| 24 -008 26 -021 28 -045 30-082 
10 25-009 23 024 22 038 | 20 093 118) 30 o10-| >30 — | >3C —/>30 — 
| 11 30 -007 27 +023 26 -035 | 24 -074 
12|}>30 —|>30 — 30 -035 | 28 -078 
K=19 | 1 3 -003 3 -003 3 -003 2 0533 |10|— _‘ — —_ — _-_ — 10-055 
2 5 -002 4 019 4 019 409 i/11;};— — —_ — 11 026 11 026 
3 7 -002 | 6 O11 6 011 5 061 |12;— — 12 011 13-049 13 -049 
4 8 -010- 8 -010 7 -037 7 037 113/13 -004 14 -021 14 -021 15-058 
>, 11 003 | 10 -o10+ 9 032 8 096 114/15 -007 16 -023 16 -023 17-054 
6 13-005 | 12 -013 11 -033 | 10 -083 | 15|17 -008 18 -020 19 -041 20 074 
7 15-007 | 14 017 13 -039 | 12 -o85 | 16/19 -006 20 = -013 22 = -046 23 = 073 
8 18 -006 17 -013 16 026 | 14 097 |17| 22 -007 24 022 25 036 27 «07 
9 21 = -006 19 -022 18 035 |17 -o71 | 18} 26 -008 28 019 30 040 | >30 — 
10 24 -008 22 -024 21 042 | 19 -065 
11 28 = -008 26 -020 24 -047 | 23 -081 
12;}>30 — 30 -021 28 -038 | 26 -090 | 
13|}>30 —|>30 — |>30 — | 30 -097 
K=20) 1 3 -002+ 3 -002+ 2 -050 2 080 |10|—  — _ —_ —_ — 10-066 
y 5 -002 4 017 4 017 4a7i11\|;\— — —_ — 11. -033 11. 033 
3 6 -010- 6 -010 6 -010 § os 112;|— — 12 015 12 015 13 063 
4 8 -008 8 -008 7 033 7 033 | 13/13 -006 13 -006 14 029 15 078 
5 10-008 10 -008 9 -027 8 084 |14| 14 -002 15 -o11 16 -035 17 078 
6 12 010-| 12 -o10 11 027 | 10 070 | 15| 16 -004 17 -013 18 033 19-065 
7 15 -005 | 14-013 13 031 | 12 -070 | 16/18 -004 20 -025- 21 = -047 22 078 
8 17 -004 16 -019 15 039 | 14 -078 | 17} 21 -007 22 = -015 24 -046 25 071 
| 20 008 | 19 -o15 18 029 | 16 095 | 18) 24 -007 26 -022 27 = -034 30 095 
10 23 009 | 22 -o15 20 046 | 19 -076 | 19} 28 -007 30 o18 | >30 — |>30 — 
11 27 = -007 25-019 23 047 | 22 074 
12|}>30 — 29 -017 27 039 | 25 -084 
13; >30 —/|>30 — |>30 — | 29 079 
| 
Random independent allocation of N balls into K cells Random independent allocation of N balls into 
so that < X cells are occupied. For given K and X and K cells so that > X cells are occupied. For given 
any N > N, the tail area at and below X is < a. Kand Xandany N < N, the tail area at and above 
































Biometrika (1961), 48, 1 and 2, p. 181 
Printed in Great Britain 


Test of independence in intraclass 2 × 2 tables
BY MASASHI OKAMOTO AND GORO ISHII


Osaka University and Atomic Bomb Casualty Commission, Hiroshima, Japan 


1. INTRODUCTION 


If for each member of a sample of size $n$ a pair $(A_i, A_j)$ of characteristics $A_i$ $(i = 1, 2, ..., k)$
is observed and if the pair $(A_i, A_j)$ is indistinguishable from the reversed pair $(A_j, A_i)$, the
data can be represented by Table 1, where $n_{ij}$ $(i \le j)$ denotes the frequency of the pair
$(A_i, A_j)$ and $n = \sum\sum_{i \le j} n_{ij}$.














Table 1. Pair table

        | A_1     A_2     ...   A_k    | Total
  A_1   | n_11    n_12    ...   n_1k   |
  A_2   |         n_22    ...   n_2k   |
  ...   |                 ...   ...    |
  A_k   |                       n_kk   |
  Total |                              |   n

Table 2. Component table

        | A_1     A_2     ...   A_k    | Total
  A_1   | 2n_11   n_12    ...   n_1k   |  n_1
  A_2   | n_21    2n_22   ...   n_2k   |  n_2
  ...   | ...     ...     ...   ...    |  ...
  A_k   | n_k1    n_k2    ...   2n_kk  |  n_k
  Total | n_1     n_2     ...   n_k    |  2n


Such a situation is met with in some fields of biology, for example, in classifying several 
pairs of twins with regard to certain attributes of an individual. Another example is the 
M, N blood groups dealt with in medical research.

If we now consider for each $i$ the number of components of the pairs having the characteristic
$A_i$, we obtain Table 2, where we have put $n_{ji} = n_{ij}$ and $n_i = 2n_{ii} + \sum_{j \ne i} n_{ij}$. Tables 1
and 2 correspond one-to-one to each other and we shall refer to the former as the pair table
and to the latter as the component table. Moreover, we shall call either of them an intraclass
contingency table in contrast to an ordinary contingency table.

As for an ordinary contingency table, we are often interested in the problem of independence
for our intraclass case. If we denote the probability of occurrence of the pair $(A_i, A_j)$
by $p_{ij}$ and that of $A_i$ by $p_i$, then the hypothesis of independence is represented by

$$H_0:\quad p_{ii} = p_i^2, \qquad p_{ij} = 2p_ip_j \quad (i < j). \tag{1}$$

When as a special case the number $k$ of characteristics $A_i$ equals two, $A$ and $\bar{A}$, there
arise three types of pair, $(A, A)$, $(A, \bar{A})$ and $(\bar{A}, \bar{A})$; hence Tables 1 and 2 reduce to Tables
3 and 4, respectively, either of which we call an intraclass 2 × 2 (contingency) table. Let
$p_{AA}$, $p_{A\bar{A}}$, $p_{\bar{A}\bar{A}}$, $p_A$ and $p_{\bar{A}}$ be the probabilities of the pairs $(A, A)$, $(A, \bar{A})$, $(\bar{A}, \bar{A})$, the
components $A$ and $\bar{A}$, respectively. Then the hypothesis of independence reduces to

$$H_0:\quad p_{AA} = p_A^2, \qquad p_{A\bar{A}} = 2p_Ap_{\bar{A}}, \qquad p_{\bar{A}\bar{A}} = p_{\bar{A}}^2, \tag{2}$$

which shows that the test of independence is identical with the test of fit to the binomial
distribution $B(2, p_A)$.










Table 3. Pair table

        | A     Ā   | Total
  A     | a     b   |
  Ā     |       c   |
  Total |           |   n

Table 4. Component table

        | A     Ā   | Total
  A     | 2a    b   |   r
  Ā     | b     2c  |   s
  Total | r     s   |   2n








2. INTRACLASS CONTINGENCY TABLE 


The probability of observing Table 1 in a random sampling with only the total $n$ fixed
is given by the multinomial distribution

$$P(n_{ii}, n_{ij}) = \frac{n!}{\prod_i n_{ii}! \prod_{i<j} n_{ij}!} \prod_i p_{ii}^{n_{ii}} \prod_{i<j} p_{ij}^{n_{ij}}, \tag{3}$$

which, under the hypothesis (1) of independence, becomes

$$P(n_{ii}, n_{ij}) = \frac{n!\; 2^{\sum\sum_{i<j} n_{ij}}}{\prod_i n_{ii}! \prod_{i<j} n_{ij}!} \prod_i p_i^{n_i}, \tag{4}$$


where $n_{ji} = n_{ij}$ and $n_i = 2n_{ii} + \sum_{j \ne i} n_{ij}$ as before.

Since (3) is a general term of the multinomial expansion of

$$\Bigl(\sum_i p_{ii} + \sum\sum_{i<j} p_{ij}\Bigr)^n,$$

which reduces under the hypothesis (1) to

$$\Bigl(\sum_i p_i^2 + \sum\sum_{i<j} 2p_ip_j\Bigr)^n = \Bigl(\sum_i p_i\Bigr)^{2n},$$

and since the probability $P(n_i)$ of the 'marginal' frequencies $n_i$ $(i = 1, 2, ..., k)$ is given by
the summation of (4) over the set of $\{n_{ii}, n_{ij}\}$ satisfying $2n_{ii} + \sum_{j \ne i} n_{ij} = n_i$, we obtain from
a general term of the expansion of $(\sum_i p_i)^{2n}$ the formula

$$P(n_i) = \frac{(2n)!}{\prod_i n_i!} \prod_i p_i^{n_i}. \tag{5}$$

Dividing (4) by (5), we get the conditional probability under $H_0$ of Table 1 when the $n_i$'s
are given

$$P(n_{ii}, n_{ij} \mid n_i) = \frac{n! \prod_i n_i!\; 2^{\sum\sum_{i<j} n_{ij}}}{(2n)! \prod_i n_{ii}! \prod_{i<j} n_{ij}!}. \tag{6}$$




From the probability distribution (6) we can calculate easily the factorial moments up to
the second order

$$E(n_{ii}) = \frac{n_i^{(2)}}{2(2n-1)}, \qquad E(n_{ii}^{(2)}) = \frac{n_i^{(4)}}{4(2n-1)(2n-3)},$$

$$E(n_{ij}) = \frac{n_in_j}{2n-1}, \qquad E(n_{ij}^{(2)}) = \frac{n_i^{(2)}n_j^{(2)}}{(2n-1)(2n-3)} \quad (i \ne j), \tag{7}$$


















where $n^{(2)}$ stands for $n(n-1)$ and $n^{(4)}$ for $n(n-1)(n-2)(n-3)$. For the statistic

$$\chi^2 = \sum\sum_{i \le j} \frac{\{n_{ij} - E(n_{ij})\}^2}{E(n_{ij})}, \tag{8}$$

which may be used for testing the hypothesis (1), we have in view of (7)

$$E(\chi^2) = \tfrac{1}{2}\Bigl(1 + \frac{1}{2n-3}\Bigr) k(k-1). \tag{9}$$
This corresponds to Haldane's (1939) result $E(\chi^2) = \{1 + 1/(n-1)\}(r-1)(s-1)$ for the
ordinary contingency table. The calculation for $V(\chi^2)$ seems to be even more tedious than
that of Haldane's case, which is itself considerably complicated, and therefore we have not
carried it through. On the other hand it is seen by the usual large sample theory that under
the unconditional distribution (4) the statistic $\chi^2$ follows in the limit a $\chi^2$ distribution with
$\tfrac{1}{2}k(k-1)$ degrees of freedom as $n \to \infty$. One of the authors (Okamoto, 1959) proved that this
is the case also with the conditional distribution (6).
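
Formula (9) can be checked numerically for small margins by enumerating every pair table under the conditional distribution (6). The Python sketch below is an illustration only, not part of the original paper: the function names `pair_tables` and `expected_chi_square` are hypothetical, exact rational arithmetic is used on the assumption that the margins are small, every margin is assumed to be at least 2 so that no expectation in (8) vanishes, and `math.prod` requires Python 3.8 or later.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial, prod


def pair_tables(margins):
    """Yield (diagonal, off_diagonal, probability) for each pair table.

    `margins` holds the component totals n_i; the probability is formula (6).
    """
    k = len(margins)
    n = sum(margins) // 2
    pairs = list(combinations(range(k), 2))
    ranges = [range(min(margins[i], margins[j]) + 1) for i, j in pairs]
    for offs in product(*ranges):
        off = dict(zip(pairs, offs))
        diag = []
        for i in range(k):
            # n_i = 2 n_ii + sum_{j != i} n_ij, so the remainder must be even.
            rest = margins[i] - sum(v for (a, b), v in off.items() if i in (a, b))
            if rest < 0 or rest % 2:
                break
            diag.append(rest // 2)
        else:
            num = factorial(n) * prod(factorial(m) for m in margins) * 2 ** sum(offs)
            den = (factorial(2 * n)
                   * prod(factorial(d) for d in diag)
                   * prod(factorial(v) for v in offs))
            yield diag, off, Fraction(num, den)


def expected_chi_square(margins):
    """Exact conditional expectation of the statistic (8), given the margins."""
    k = len(margins)
    two_n = sum(margins)
    # First-order moments from (7).
    e_diag = [Fraction(m * (m - 1), 2 * (two_n - 1)) for m in margins]
    e_off = {(i, j): Fraction(margins[i] * margins[j], two_n - 1)
             for i, j in combinations(range(k), 2)}
    total, mean = Fraction(0), Fraction(0)
    for diag, off, p in pair_tables(margins):
        chi2 = sum((d - e) ** 2 / e for d, e in zip(diag, e_diag))
        chi2 += sum((off[key] - e) ** 2 / e for key, e in e_off.items())
        total += p
        mean += p * chi2
    assert total == 1  # the probabilities (6) sum to one
    return mean
```

For margins (4, 3, 3), that is k = 3 and n = 5, formula (9) gives (1/2)(1 + 1/7)(3)(2) = 24/7, which is the value `expected_chi_square((4, 3, 3))` is expected to return.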


3. CONDITIONAL TEST FOR AN INTRACLASS 2 x 2 TABLE 


We shall consider the test of the hypothesis $H_0$ of independence given by (2) for an
intraclass 2 × 2 table represented in the form of Table 3 or 4. The $\chi^2$ test based on the statistic
(8) may be used if the expected values of the frequencies a, b and c under the null hypothesis 
are all moderately large. But if the expected values are not so large we recommend an exact 
test, the intraclass analogue of Fisher’s exact test applied to the ordinary 2 x 2 contingency 
table. This test was introduced by Fisher in 1934 into the fifth edition of Statistical Methods 
for Research Workers (see also, Yates, 1934; Fisher, 1935), and has led to much discussion 
among mathematical statisticians, for instance, Barnard (1947), Pearson (1947, 1955) and 
Fisher (1955). 

As a special case of the equations (3) and (6) we get for an intraclass 2 × 2 table as shown
in Table 4 the unconditional probability

$$P(a, b, c) = \frac{n!}{a!\,b!\,c!}\,(p_{AA})^a (p_{A\bar{A}})^b (p_{\bar{A}\bar{A}})^c \tag{10}$$

and the conditional probability, given the 'marginal' frequency $r$,

$$P(a \mid r) = \frac{n!\,r!\,s!\,2^b}{(2n)!\,a!\,b!\,c!}. \tag{11}$$

Now we see that the null hypothesis $H_0$ is equivalent to

$$H_0':\quad \Delta = 1,$$

where

$$\Delta = \frac{4p_{AA}\,p_{\bar{A}\bar{A}}}{p_{A\bar{A}}^2} \tag{12}$$

gives a measure of dependence of two characteristics $A$ and $\bar{A}$, in that $\Delta > 1$ corresponds
to positive and $\Delta < 1$ to negative dependence.
In order to test the hypothesis $H_0'$ we shall advocate the conditional test† based on the
probability distribution (11) after the fashion of Fisher and Yates. That is, if we are
concerned with positive dependence, $\Delta > 1$, as an alternative, we reject the null hypothesis
whenever the frequency $a$ satisfies $a \ge k$ for a suitably chosen constant $k$. For the critical
value $k$ we can calculate the size of the test by summing the probabilities (11) over the range
$a \ge k$. On the other hand, if our concern lies in negative dependence, then the critical
region becomes $a \le k$. The use of the conditional distribution (11) was suggested by
Armitage & Healy (1957)† in answer to a query concerning a practical example.

† Tocher (1950) and Lehmann & Scheffé (1955) proved that Fisher's exact test, if appropriately
randomized at the critical value, is the most powerful unbiased test of the hypothesis of independence
with regard to the one-sided alternative. Our test, if randomized similarly, is also the most powerful
unbiased test of the hypothesis (2), since the distribution (10) belongs to Lehmann & Scheffé's
exponential family with $\theta = \log \Delta$ as the parameter tested and $\vartheta = \log(p_{AA}/p_{\bar{A}\bar{A}})$ as a nuisance
parameter. One of the authors, Goro Ishii, is now carrying on a study on the power function of this test,
which will soon be published.
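
The conditional test can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used to compute the published tables: the function names `conditional_pmf`, `upper_tail` and `lower_tail` are hypothetical, and exact fractions are used on the assumption that r and s are of the modest size covered by Tables 5 and 6.

```python
from fractions import Fraction
from math import factorial


def conditional_pmf(r, s):
    """Return {a: P(a | r)} from formula (11), for component margins r and s."""
    if (r + s) % 2:
        raise ValueError("r + s must be even, since r + s = 2n")
    n = (r + s) // 2
    pmf = {}
    for a in range(max(0, r - n), r // 2 + 1):
        b = r - 2 * a          # frequency of mixed pairs
        c = n - a - b          # frequency of the remaining like pairs
        num = factorial(n) * factorial(r) * factorial(s) * 2 ** b
        den = factorial(2 * n) * factorial(a) * factorial(b) * factorial(c)
        pmf[a] = Fraction(num, den)
    return pmf


def upper_tail(r, s, k):
    """Size of the critical region a >= k (alternative: positive dependence)."""
    return sum(p for a, p in conditional_pmf(r, s).items() if a >= k)


def lower_tail(r, s, k):
    """Size of the critical region a <= k (alternative: negative dependence)."""
    return sum(p for a, p in conditional_pmf(r, s).items() if a <= k)
```

As a check against the tables, `float(upper_tail(4, 4, 2))` is about 0.0857, matching the entry ·086 of Table 5 for s = r = 4, and `float(lower_tail(6, 6, 0))` is about 0.0693, matching the entry ·069 of Table 6 for s = r = 6.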


4. TABLES FOR THE CONDITIONAL TEST


We have constructed three tables useful in applying the conditional test as described in the 
preceding section to any numerical data of the intraclass 2 x 2 table. For the convenience 
of users we have followed the style of Finney (1948) and Latscha (1953) for the table of 
critical values and levels of significance and also that of Pearson (1947) for the order of 
accuracy of the normal approximation, both presented for an ordinary 2 x 2 contingency 
table. 

We tabulate the critical values corresponding to the frequency $a$ of the pair $(A, A)$ and
the levels of significance (or the sizes of the test) for either of two one-sided alternatives,
positive and negative dependence, for the range $r = 4(1)s$, $s = 4(1)30$‡ of the marginal
frequencies $r$ and $s$ and for the nominal levels of significance 0·10, 0·05, 0·01 and 0·001.
Table 5 is available for the alternative of positive and Table 6 of negative dependence.
Numbers in boldface show critical values and those in small type the corresponding exact
levels of significance. When $s$ is larger than 30 and our tables are of no use, a normal
approximation is recommended. It is proved by Okamoto (1959) that if $n \to \infty$ and $r = 2np + o(\sqrt{n})$,
$0 < p < 1$, then under the conditional distribution (11) the frequency $a$ follows asymptotically
a normal distribution, $N(np^2, np^2q^2)$, where $q = 1 - p$. Hence it holds that
$4n^{3/2}(rs)^{-1}[a - r^2/(4n)]$ follows in the limit a standardized normal distribution. In order to
get a better approximation it will be convenient to use Yates' correction

$$\Pr\{a \ge k \mid r\} \doteq \Pr\Bigl\{Z \ge \frac{4n^{3/2}}{rs}\Bigl(k - \frac{1}{2} - \frac{r^2}{4n}\Bigr)\Bigr\}, \tag{13}$$

where $Z$ denotes a standardized normal variate. $\Pr\{a \le k \mid r\}$ is obtained by reversing the
two inequality signs and substituting $k + \tfrac{1}{2}$ for $k - \tfrac{1}{2}$ in (13). A better approximation is
accomplished by standardizing $a$ by its mean $\tfrac{1}{2}r(r-1)/(2n-1)$ and variance

$$\tfrac{1}{2}rs(r-1)(s-1)/\{(2n-1)^2(2n-3)\},$$

calculated from (7), that is,

$$\Pr\{a \ge k \mid r\} \doteq \Pr\Bigl\{Z \ge \frac{(2n-1)\sqrt{2(2n-3)}}{\sqrt{rs(r-1)(s-1)}}\Bigl[k - \frac{1}{2} - \frac{r(r-1)}{2(2n-1)}\Bigr]\Bigr\}. \tag{14}$$

The modification required for the reversed inequality is the same as with (13). The order
of accuracy of the approximation (14) is illustrated by Table 7 for some typical values of
the pair $(r, s)$.
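
The approximation (14) is easy to evaluate directly. The following Python sketch is given only as an illustration: the function names `z_upper`, `normal_upper_tail` and `approx_upper_tail` are hypothetical, and the normal tail area is obtained from the complementary error function in the standard library rather than from printed tables.

```python
from math import erfc, sqrt


def z_upper(r, s, k):
    """Standardized deviate in (14) for the event a >= k, with r + s = 2n."""
    two_n = r + s
    mean = r * (r - 1) / (2 * (two_n - 1))
    var = r * s * (r - 1) * (s - 1) / (2 * (two_n - 1) ** 2 * (two_n - 3))
    # Continuity correction: k - 1/2 replaces k.
    return (k - 0.5 - mean) / sqrt(var)


def normal_upper_tail(z):
    """Pr{Z >= z} for a standardized normal variate Z."""
    return 0.5 * erfc(z / sqrt(2.0))


def approx_upper_tail(r, s, k):
    """Normal approximation (14) to Pr{a >= k | r}."""
    return normal_upper_tail(z_upper(r, s, k))
```

With r = 19, s = 37 and k = 5, the values used in the example of Section 5, `z_upper` gives about 1.167 and `approx_upper_tail` about 0.1216. For r = 20, s = 40 and k = 5 the approximation is about 0.1476, in agreement with the corresponding entry of Table 7.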


† The authors would like to thank the referee for drawing their attention to this work.
‡ The computation was actually carried out for the wider range r = 4(1)s, s = 4(1)40. Copies of
this extension for s > 30 will be sent to interested readers on request.



















5. AN EXAMPLE 


As an illustration let us examine the data† of the S blood groups (secretor S and
non-secretor s) in saliva for 28 pairs of male dizygotic twins in Osaka City, Japan.

Pair table

                      s                S
                (non-secretor)     (secretor)
  s                   5                9          r = 19
  S                   -               14          s = 37
                                                  n = 28


The entry 5, for instance, denotes the frequency of pairs of twins of which both members 
belong to the s blood group. The marginal totals are 


r=5x2+9=19 and s=14x2+9 = 37. 


There appears a slight suspicion of positive dependence of S and s within a pair of twins.
Since $s = 37$ is larger than 30 and Table 5 is not available we use the normal
approximation (14), obtaining

$$\Pr\{a \ge 5 \mid r = 19\} \doteq \Pr\{Z \ge 1{\cdot}167\} = 0{\cdot}1216.$$

It is seen from Table 7 (case r = 20, s = 40) that this figure cannot be far in error.‡ We
therefore conclude that the data are insufficient for inferring the dependence in question.
With a sample of 23 pairs of twins, showing frequencies

       s     S
  s    6     5     r = 17
  S    -    12     s = 29

it would have been possible to use Table 5. We find that for the conditional distribution,
the probability of $a \ge 6$ is only 0·012 and we should now have fairly strong grounds for
suspecting positive dependence of S and s within a pair of twins.
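
Both tail probabilities quoted in this example can be recomputed from the pair frequencies alone. The sketch below is an illustration under stated assumptions: the function name `exact_upper_tail` is hypothetical, and the recurrence between successive terms of (11) is a numerical convenience introduced here, not a step taken from the paper.

```python
def exact_upper_tail(a, b, c):
    """Pr{a' >= a | r} under (11), for observed pair frequencies a, b, c."""
    n = a + b + c
    r = 2 * a + b
    lo = max(0, r - n)   # smallest attainable value of a'
    hi = r // 2          # largest attainable value of a'
    # Unnormalised weights w(x) proportional to 2**(r - 2x) / (x! (r - 2x)! (n - r + x)!),
    # built from w(x + 1) / w(x) = (r - 2x)(r - 2x - 1) / (4 (x + 1)(n - r + x + 1)).
    weights = {lo: 1.0}
    for x in range(lo, hi):
        ratio = (r - 2 * x) * (r - 2 * x - 1) / (4 * (x + 1) * (n - r + x + 1))
        weights[x + 1] = weights[x] * ratio
    total = sum(weights.values())
    return sum(w for x, w in weights.items() if x >= a) / total
```

For the 28 pairs (a = 5, b = 9, c = 14) this gives roughly 0.118, which exceeds 0·10, and for the 23 pairs (a = 6, b = 5, c = 12) roughly 0.012.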


The authors are grateful to Prof. Junjiro Ogawa, Nihon University, and the referee for 
helpful suggestions. 


REFERENCES 


Armitage, P. & Healy, M. J. R. (1957). Interpretation of χ² tests. Biometrics, 13, 113-15.

Barnard, G. A. (1947). Significance tests for 2 × 2 tables. Biometrika, 34, 123-38.

Finney, D. J. (1948). The Fisher-Yates test of significance in 2 × 2 contingency tables. Biometrika,
35, 145-56.

Fisher, R. A. (1935). The logic of inductive inference. J. R. Statist. Soc. 98, 39-54.

Fisher, R. A. (1955). Statistical methods and scientific induction. J. R. Statist. Soc. B, 17, 69-78.

Haldane, J. B. S. (1939). The mean and variance of χ², when used as a test of homogeneity, when
expectations are small. Biometrika, 31, 346-55.

Latscha, R. (1953). Tests of significance in a 2 × 2 contingency table: extension of Finney's table.
Biometrika, 40, 74-86.


† They have been kindly lent to the authors by Prof. Mototsugu Kohama, Department of Anatomy,
School of Medicine, Osaka University.

‡ The extended table mentioned in the footnote ‡ to p. 184 shows, in fact, that for independence
the probability of having a ≥ 5 with r = 19, s = 37 is more than 0·10.







Lehmann, E. L. & Scheffé, H. (1955). Completeness, similar regions and unbiased estimation, II.
Sankhyā, 15, 219-36.

Okamoto, M. (1959). A convergence theorem for discrete probability distributions. Ann. Inst.
Statist. Math. 11, 107-12.

Pearson, E. S. (1947). The choice of statistical test illustrated on the interpretation of data classed
in a 2 × 2 table. Biometrika, 34, 139-67.

Pearson, E. S. (1955). Statistical concepts in their relation to reality. J. R. Statist. Soc. B, 17,
204-7.

Tocher, K. D. (1950). Extension of the Neyman-Pearson theory of tests to discontinuous variates.
Biometrika, 37, 130-44.

Yates, F. (1934). Contingency tables involving small numbers and the χ² test. J. R. Statist. Soc.
Suppl. 1, 217-35.














































Table 5. Significance levels of a for positive dependence
(Values of a in bold type; corresponding probabilities P, in small type.) 
Probability Probability 
Me td ,iF 
0-1 0-05 0-01 0-001 0-1 0-05 0-01 0-001 
4| 4)| 2 -086 — — — 16) 6 3 -002 3 -002 3 -002 _— 
4 2 -009 2 -009 2 -009 7s 
6| 6 3 -022 3 -022 ane = 2 1 -0s59 at bad = 
4 2 -048 2 -048 os 4 
a7 | 17 6 -087 7 -008 7 -008 8 -0002 
7 7 3 -082 — = aa 15 6 -014 6 -014 7 -000 7 0004 
13 5 -025- 5 -025- 6 -001- | 6 -0008 
8 8 4 -005+ 4 -005+ 4 -005+ = 11 4 -046 4 -046 5 -002 — 
6 3 012 3 012 a pans 9 3 -092 4 -004 4 -004 = 
4 2 -030 2 -030 ae — q 3 011 3 -o11 as a 
5 2 036 2 -036 ae Ee 
9/| 9 4 -026 4 -026 nee — 
y 3 -049 3 -049 = sie 18 | 18 7 -019 7 -019 8 -001- 8 -0009 
16 6 -031 6 -031 7 -002 8 -0000 
10 | 10 4 -070 5 -001+ 5 -001+ — 14 5 -052 6 -003 6 -003 7 -0000 
8 4 -003 4 -003 4 -003 — 12 4 -090 5 -006 5 -006 6 -0001- 
6 3 -007 3 -007 3 -007 — 10 4 014 4 -014 5 -000 5 -0002 
4 2 -021 2 021 = — 8 3 -033 3 -033 4 -000 4 -0005- 
2 1 -091 — — — 6 | 2 -090 3 -002 3 -002 =: 
4| 2 -008 2 -008 2 -008 Sok 
11 } 11 5 -008 5 -008 5 -008 — 2 1 -053 aoe a =a 
9 4 -015+ 4 -015+ va =< 
7 3 -032 3 -032 = — 19 | 19 7 -039 7 -039 8 -003 9 -0001- 
5 2 077 aw == eae 17 6 -060 7 -005- 7 -005- 8 -0001+ 
15 5 -094 6 -009 6 -009 7 -0002 
12 | 12 5 -025- 5 -025- 6 -000 6 -0003 13 5 -017 5 -017 6 -000 6 -0005- 
10 4 -044 4 -044 5 -001- 5 -0007 11 4 -034 4 -034 5 -001+ — 
8 3 -082 4 -002 4 -002 —_ 9 3 -072 4 -003 4 -003 — 
6 3 -005- 3 -005- 3 -005- —_ | 3 -009 3 -009 3 -009 — 
4 2 -015+ 2 015+ =< v= 5 2 -031 2 -031 ae at 
2 1 -077 ms e wail 
20 | 20 7 -069 8 -008 8 -008 9 -0003 
13 | 13 5 -058 6 -002 6 -002 = 18 7 -012 7 -012 8 -000 8 -0005- 
11 4 -093 5 -004 5 -004 =e 16 | 6 -021 6 -021 7 -001- 7 -0010- 
9 4 -009 4 -009 4 -009 = 14 5 -037 5 -037 6 -002 7 -0000 
7 3 -022 3 022 ann — 12 4 -068 5 -004 5 -004 6 -0000 
5 2 -059 — = — 10 4 -010+ 4 -010+ 5 -000 5 -0001- 
8 3 -026 3 -026 4 -000 4 -0003 
14| 14 | 6 -008 6 -008 6 -008 7 -0001- 6 2 -076 3 -0o01+ 3 -o01+ a 
12 5 -01s+ 5 -01s+ 6 -000 6 -0002 4 2 -006 2 -006 2 -006 — 
10 | 4 -029 4 -029 5 -000 5 -0004 2 1 -048 1 -048 = me 
8 3 -059 4 -00i+ 4 -001+ — 
6 3 -003 3 -003 3 -003 a Zi | 2 8 -017 8 -017 9 -001- 9 -0010- 
4 2 -012 2 012 — — 19 7 026 7 026 8 -002 9 -0000 
2 1 -067 ae — — 17 6 -042 6 -042 7 -003 8 -0001- 
15 5 -069 6 -006 6 -006 7 -0001+ 
15 | 15 6 -022 6 -022 7 -001- 7 -0007 13 5 -012 5 -012 6 -000 6 -0003 
13 5 -037 5 -037 6 -001+ — 11 4 -026 4 -026 5 -001- 5 -0007 
11 4 -065- 5 -003 5 -003 = 9 3 -058 4 -002 4 -002 as 
9 4 -006 4 -006 4 -006 hee 7 3 -007 3 -007 3 -007 ioe 
7 3 -015+ 3 -015+ = ee 5 2 -026 2 026 — =— 
5 2 -046 2 -046 ae aes 
Zz i Ze 8 -032 8 -032 9 003 | 10 -0001- 
16} 16} 6 -048 6 -048 7 -003 8 -0000 20 | 7 -048 7 -048 8 -005s- | 9 -0002 
14 5 -074 6 005+ | 6 -005+ | 7 -0000 18 6 -074 7 -008 7 -008 8 -0003 
12 5 -010- 5 -010- 5 -010- | 6 -0001- 16 | 6 -015s- 6 -015- 7 -001- 7 -0006 
10 | 4 -020 4 -020 5 -000 5 -0002 14) 5 027 5 -027 6 -001+ | 7 -0000 
8 3 -044 3 -044 4 -001- 4 -0007 12 4 -053 5 -003 5 -003 6 -0000 







































































Table 5 (cont.) 
Probability Probability 
on ae Ss r 
0-1 0-05 0-01 0-001 0-1 0:05 0-01 0-001 
22 | 10 4 -008 4 -008 4 -008 5 -o001- | 27 | 25 8 -099 9 018 | 10 -002 | 11 -0001- 
8 3 -021 3 -021 4 -000 4 -0002 23 8 -028 8 -028 9 -003 10 -o001+ 
6 2 -065- 3 -001- 3 -001- 3 -0010- 21 7 -043 7 -043 8 -005- 9 -0002 
4 2 -005+ 2 -005+ 2 -005+ —_ 19 6 -069 7 -009 7 -009 8 -0004 
2 1 -043 1 -043 a — 17 6 -016 6 -016 7 -001- 7 -0009 
15 5 -031 5 -031 6 -002 7 -0000 
23 | 23 8 -056 9 007 9 007 | 10 -0003 13 4 -061 5 -005- 5 -005- | 6 -0001- 
21 7 -081 8 -o11 9 .001- 9 -0006 11 4 012 4 -012 5 -000 5 -0003 
19 | 7 -018 7 -018 8 -oo1t | 9 -0000 9 3 -032 3 -032 4 001- | 4 -0009 
17 | 6 -030 6 -030 7 -002 8 -0000 7 2 -096 3 -004 3 -004 om 
15 5 -052 6 -004 6 -004 7 -0001- 5 2 -017 2 017 _ — 
13 4 -094 5 -009 5 -009 6 -0002 
11 4 -020 4 020 5 -001- 5 -ooos+ | 28 | 28 | 10 -022 | 10 -022 | 11 -002 | 12 -o001+ 
9 3 -047 3 -047 4 -002 —_ 26 9 -032 9 -032 | 10 -004 | 11 -0002 
7 3 -005+ 3 -005+ 3 -005+ — 24 8 -047 8 -047 9 -006 | 10 -0004 
| 5 | 2 -022 2 022 i a 22 | 7 -o71 8 -o10+ | 9 -oo1- | 9 -0007 
20 | 7 -018 7 -018 8 -oo1+ | 9 -0000 
24 | 24] 8 -089 9 014 | 10 -o01- | 10 -0010- 18 6 -031 6 -031 7 -003 8 -0001- 
22 8 -022 8 -022 9 -002 | 10 -0000 16 5 -055+ 6 -005s+ 6 -005+ | 7 -0002 
20 7 034 7 -034 8 -003 9 -0001- 14 5 -012 5 -012 6 -000 6 -0004 
18 6 -055- 7 005+ 7 005+ 8 -0002 12 4 -026 4 -026 5 -001+ 6 -0000 
16 5 -089 6 -010+ 7 -000 7 -0004 10 3 -063 4 -003 4 -003 5 -0000 
14 5 020 5 -020 6 -001- 6 -0009 8 3 011 3 -o11 4 -000 4 -0001+ 
12 | 4 041 4 -041 5 -002 6 -0000 6| 2 043 2 -043 3 -001- 3 -0005+ 
10 3 -089 4 -006 4 -006 5 -0000 4 2 -003 2 -003 2 -003 _— 
8 3 -017 3 -017 4 -000 4 -0002 2 1 -034 1 -034 =< — 
6 2 056 3 -001- 3 -001- 3 -0008 
4| 2 -004 2 -004 2 004 — 29 | 29 | 10 -037 | 10 -037 | 11 -oos+ | 12 -0004 
2 1 -040 1 -040 — — 27 9 052 | 10 -o08 | 10 -o08 | 11 -0006 
25 8 -075- | 9 012 | 10 -001+ | 11 -0000 
25 | 25 | 9 -026 9 026 | 10 003 | 11 -0001+ 23 8 -020 8 -020 9 -002 | 10 -0001- 
23 8 -039 8 -039 9 004 | 10 -0002 21 7 032 7 -032 8 -003 9 -0001+ 
21 7 -059 8 -007 8 -007 9 -0003 19 6 -054 7 -006 7 -006 8 -0003 
19 | 6 -090 7 012 8 -001- 8 -0007 17 5 -090 6 -012 7 -001- 7 -0006 
17 6 -022 6 -022 7 -001+ 8 -0000 15 5 -024 5 024 6 -o01+ 7 -0000 
15 5 -040 5 -040 6 -003 7 -0001- 13 4 -050+ 5 -004 5 -004 6 -0001- 
13 4 -075+ 5 -006 5 -006 6 -0001+ 11 4 -010- 4 -010- 4 -010- 5 -0002 
11 4 -015+ 4 -015+ 5 -000 5 -0004 9 3 -027 3 -027 4 -001- 4 -0007 
9 3 -039 3 -039 4 -oo1+ nes 7 2 -085- 3 -003 3 -003 — 
7 3 -004 3 -004 3 -004 ae 5 2 -015- 2 -015- ome — 
5 2 -019 2 019 ee = 3 1 -097 — _ — 
26 | 26 9 .045+ 9 045+ | 10 006 | 11 -0003 | 30 | 30! 10 -oss 11 -o10- | 11 -o10- | 12 -0009 
24 8 -065- 9 -009 9 -009 | 10 -0006 28 9 os0 | 10 -015- | 11 -oo1+ | 12 -0001- 
22 | 7 -093 8 -015- | 9 -o01+ | 10 -0000 26 | 9 -023 9 023 | 10 002 | 11 -o001+ 
20 7 024 7 -024 8 -002 9 -0001- 24 8 -035- 8 -035- 9 -004 | 10 -0002 
18 6 -041 6 -041 7 -004 8 -0001+ 22 | 7 -054 8 -007 8 -007 9 -0004 
16 5 063 6 -007 6 -007 7 -0003 20 6 -086 7 013 8 -001- 8 -0009 
14 5 -01s+ 5 -01s+ 6 -001- 6 -0006 18 6 -024 6 -024 7 -002 8 -0000 
12 4 -033 4 -033 5 -002 6 -0000 16 5 -044 5 -044 6 -004 7 -0001+ 
10 3 -075- 4 -004 4 -004 5 -0000 14 4 -08s- 5 -009 5 -009 6 -0003 
8 3 -014 3 -014 4 -000 4 -o001+ 12 4 -021 4 -021 5 -001- 5 -0009 
6 2 049 2 049 3 -001- 3 -0006 10 3 -054 4 -003 4 -003 5 -0000 
4 2 -004 2 -004 2 -004 — 8 3 -010- 3 -010- 3 -010- 4 -0001- 
2 1 -037 1 -037 pease eee 6 2 -038 2 -038 3 -000 3 -0004 
4 2 -003 2 -003 2 -003 — 
27 | 27 9 sek 10 012 | 11 -oo1- | 11 -oo10- 2 1 -032 1 -032 ~— — 


















































































Table 6. Significance levels of a for negative dependence 
(Values of a in bold type; corresponding probabilities P, in small type.)
| 
Probability | Probability 
Ss r Ss e4 — 4 
0-1 0:05 0:01 0-001 | 0-1 0-05 0-01 0-001 
| 
6| 6! O 069 ahs ae — 20 | 20 2 -015- 2 -015- 1 -001- 1 -0007 
18 2 -063 1 -006 1 -006 0 -0001+ 
; 0 -037 0 -037 — —_— 16 1 -029 1 -029 0 -001+ ae 
14 1 -099 0 -008 0 -008 Je 
8 | 8| O -020 0 -020 a = 12 | O -033 0 -033 ae ae 
| 
9) 9} O on 0 011 = on 21 | 21 | 3 075+ | 2 -o09 2 -009 1 -0004 
7 0 090 | a = = 19 2 -042 2 -042 1 -003 0 -0001- 
| | 17| 1 018 1 018 0 -co1- | O -0008 
10 10 0 -006 | O -006 0 -006 — 15 1 -068 0 -005- 0 -005- ee | 
8 | 0 053 | == co” I — 13 | O -o21 0 -021 mar a 
11 0 -069 a = ae 
11 | 11 1 -083 0 -003 0 -003 — 
9! O -030 0030 | — — 221-29 3 -052 2 -006 2 -006 1 -0002 
| 20 | 2 -028 2 -028 1 -002 0 -0000 
miei dee | gem oem) — | fis] 2 a6 | 1am | Om | 0 am 
} ; iy Wee 16 | 1 -046 1 -046 0 -003 docs 
8 | 0 01 =r | | 14} 0 -013 0 -013 ae a= 
13} 13 | 1 -032 1 032 | © -co1- | O -0008 oT ee) ee al = 
; pod 7a) oe 5 23 | 23! 3 036 | 3 036 | 2 -oo4 | 1 -ooo1t 
Pe Mb yiee ae 21 2 -018 2 -018 1 -001+ | O -0000 
19 2 -067 1 -007 1 -007 0 -0002 
13S dS) 3] (Sel 4S) a S| ee |“ 
10! 0 034 | © -034 on’ 15 ; inc ; 008 | 0 -008 ee 
ie 13 030 030 ee a 
15/15] 1-01 | 1-01 | 0 -000 | © -0002 11 | 0 089 = aa - 
fi } bo ° i — Be 24 | 24 : 024 2 024 ; 002 1 -0001- 
9} 006 | — anti tid BG 22 083 012 ‘001- | 1 -0007 
| 20 2 -046 2 -046 1 -004 0 -0001+ 
16} 16] 2 -081 1 007 | 1 007 | 0 -oo01+ 18 | 1 020 | 1 020 | O -oo1- | O -c010- 
14 1 -040 1. -040 | 2 ohn 16 1 -066 0 -005+ 0 -005+ _— 
12| 0 -o12 0 -012 a . ; 020 | O 020 — _ 
10 0 -055+ a me as 1 061 a as een 
17|17| 2 -054 1 -004 1 -004 0 -ooo1- | 25 | 25 | 4 -094 3 -016 2 -o01+ | 1 -0000 
15 1 -025+ 1 025+ | O -001- 0 -0009 23 3 -059 2 -008 2 -008 1 -0004 
13 0 -007 0 -007 0 -007 ous 21 2 -031 2 -031 1 -003 0 -0001- 
11 0 .035- | O -035- = = 19 2 096 1 -013 : -001- 0 -0006 
17 1 -045- 1 -045- -003 as 
18 | 18 2 -035+ 2 -035+ 1 -002 0 -0000 15 | O -013 0 -013 a = 
16 1 -016 1 -016 0 -001- | O -0005+ 13 0 -041 0 -041 scsi 5 
14 1 -067 0 -004 0 -004 es 
12 | O -022 0 -022 ss — 26 | 26 | 4 -068 3 011 : -001~ : -0008 
10 | O -078 a —_ _ 24 3 -041 3 -041 -005- -0002 
22 2 -021 2 -021 1 -002 0 -0000 
19 | 19 2 -023 2 -023 1 -001+ 0 -0000 20 2 -068 1 -008 1 -008 0 -0003 
17 2 -091 1 -010- 1 -010- 0 -0003 18 1 -030 1 -030 0 -002 ee 
15 1 -044 1 -044 0 -002 — 16 1 -088 0 -008 0 -008 = 
13 | O -013 0 -013 a — 14} O 027 0 -027 ca ~~ 
11 | O ost ae —_ _ 12 | 0 -076 a aa et 








































Table 6 (cont.) 



























































Probability Probability 
ae sir 
0-1 0-05 0-01 0-001 0-1 0-05 0-01 0-001 
27 | 27 | 4 -049 4 -049 3 -007 2 -000s- | 29 | 29 | 4 -024 4 -024 3 -003 2 -0002 
25 3 -028 3 -028 2 -003 1 -0001+ 27 4 -075- 3 -013 2 -001+ 1 -0000 
23 3 -087 2 -014 1 001+ | O -0000 25 3 -045- 3 -045- | 2 -006 1 -0004 
21 2 -048 2 -048 1 -005+ 0 -0002 23 2 -023 2 -023 1 -002 0 -0001- 
19 1 -020 1 -020 0 -oo1+ — 21 2 -068 1 -009 1 -009 0 -0004 
17 1 -062 0 -005+ 0 -00s+ — 19 1 -029 1 -029 0 -002 _— 
15 | 0O -018 0 -018 —_ — 17 1 -081 0 -008 O -008 —_— 
13 0 -053 aa eae —_— 15 0 -024 0 -024 — — 
13 0 -065+ = —_ — 
28 | 28 | 4 -034 4 034 3 -004 2 0003 | 30 | 30| 5 -083 4 -016 3 -002 2 -0001- 
26 3 -019 3 -019 2 -002 1 -0001- 28 4 -054 3 -009 3 -009 2 -0007 
24 3 063 2 -009 2 -009 1 -0006 26 3 -031 3 -031 2 -004 1 -0002 
22 2 -033 2 -033 1 -003 0 -0001+ 24 3 -089 2 015+ 1 -001+ 0 -0000 
20 2 094 1 013 0 -001- 0 -0007 22 2 -048 2 -048 1 -006 0 -0002 
18 1 -043 1 -043 0 -003 — 20 1 -020 1 -020 0 -0o1+ _ 
16 0 -012 0 -012 i se 18 1 -058 0 -005- 0 -005- — 
14 0 -036 0 -036 —_ —_ 16 0 -016 0 -016 —_ —_ 
12 | O -o92 — oe — 14 | O -046 0 -046 — — 
Table 7. Normal approximation for significance levels

  Chance that     True      Normal approx.

  r = 20, s = 40
  a ≤ 0           0·0075    0·0130
  a ≤ 1           0·0724    0·0796
  a ≤ 2           0·2793    0·2778
  a ≥ 5           0·1438    0·1476
  a ≥ 6           0·0300    0·0311
  a ≥ 7           0·0033    0·0036

  r = 30, s = 40
  a ≤ 2           0·0033    0·0045
  a ≤ 3           0·0242    0·0273
  a ≤ 4           0·1042    0·1080
  a ≤ 5           0·2890    0·2907
  a ≥ 8           0·2028    0·2062
  a ≥ 9           0·0631    0·0661
  a ≥ 10          0·0126    0·0142
  a ≥ 11          0·0015    0·0020

  r = 30, s = 60
  a ≤ 0           0·0006    0·0017
  a ≤ 1           0·0080    0·0117
  a ≤ 2           0·0496    0·0551
  a ≤ 3           0·1747    0·1766
  a ≥ 7           0·1380    0·1404
  a ≥ 8           0·0398    0·0403
  a ≥ 9           0·0077    0·0073
  a ≥ 10          0·0010    0·0010

  r = 30, s = 100
  a ≤ 0           0·0124    0·0217
  a ≤ 1           0·0871    0·0934
  a ≤ 2           0·2779    0·2706
  a ≥ 5           0·2094    0·2147
  a ≥ 6           0·0704    0·0680
  a ≥ 7           0·0167    0·0142
  a ≥ 8           0·0027    0·0019

  r = 40, s = 40
  a ≤ 5           0·0024    0·0030
  a ≤ 6           0·0153    0·0170
  a ≤ 7           0·0650    0·0679
  a ≤ 8           0·1912    0·1940
  a ≥ 12          0·1503    0·1533
  a ≥ 13          0·0467    0·0494
  a ≥ 14          0·0099    0·0113
  a ≥ 15          0·0014    0·0018

  r = 40, s = 60
  a ≤ 3           0·0039    0·0051
  a ≤ 4           0·0212    0·0237
  a ≤ 5           0·0782    0·0814
  a ≤ 6           0·2075    0·2092
  a ≥ 10          0·1683    0·1707
  a ≥ 11          0·0603    0·0620
  a ≥ 12          0·0158    0·0168
  a ≥ 13          0·0030    0·0033

  r = 40, s = 100
  a ≤ 0           0·0003    0·0014
  a ≤ 1           0·0047    0·0080
  a ≤ 2           0·0285    0·0342
  a ≤ 3           0·1041    0·1082
  a ≤ 4           0·2602    0·2576
  a ≥ 8           0·1333    0·1344
  a ≥ 9           0·0464    0·0454
  a ≥ 10          0·0123    0·0114
  a ≥ 11          0·0024    0·0021

  r = 80, s = 100
  a ≤ 10          0·0010    0·0012
  a ≤ 11          0·0039    0·0044
  a ≤ 12          0·0132    0·0141
  a ≤ 13          0·0372    0·0385
  a ≤ 14          0·0882    0·0897
  a ≤ 15          0·1783    0·1796
  a ≥ 20          0·2146    0·2159
  a ≥ 21          0·1115    0·1128
  a ≥ 22          0·0497    0·0507
  a ≥ 23          0·0188    0·0195
  a ≥ 24          0·0060    0·0064
  a ≥ 25          0·0016    0·0018































Biometrika (1961), 48, 1 and 2, p. 191 
Printed in Great Britain 


Miscellanea 


Table of Neyman-shortest unbiased confidence intervals
for the Poisson parameter†


By COLIN R. BLYTH AND DAVID W. HUTCHINSON


University of Illinois, Urbana, Illinois 


1. Introduction and summary. Neyman-shortest unbiased confidence intervals for the Poisson parameter
can be obtained in exactly the same way as was done for the Binomial parameter by Blyth &
Hutchinson (1960). In the Poisson case the equations for the best unbiased acceptance region of size $\alpha$,
namely $n_0+\gamma_0 \le X+Y \le n_1+\gamma_1$, reduce to

$$n_0,\ n_1 \text{ integers},\qquad 0\le\gamma_0\le 1,\qquad 0\le\gamma_1\le 1,$$

$$\gamma_0 = \frac{(n_1-\lambda^*)\{P_{\lambda^*}(n_0\le X\le n_1-1)-\alpha\}-n_0P_{\lambda^*}(X=n_0)+n_1P_{\lambda^*}(X=n_1)}{(n_1-n_0)\,P_{\lambda^*}(X=n_0)},$$

$$\gamma_1 = \frac{(n_0-\lambda^*)\{P_{\lambda^*}(n_0\le X\le n_1-1)-\alpha\}-n_0P_{\lambda^*}(X=n_0)+n_1P_{\lambda^*}(X=n_1)}{(n_1-n_0)\,P_{\lambda^*}(X=n_1)}.$$


The University of Illinois Digital Computer Laboratory's ILLIAC was programmed to compute and
print out $n_0+\gamma_0$, $n_1+\gamma_1$ for any $\alpha$ and any set of $\lambda^*$ values. The values used were $\alpha$ = 0.95, 0.99 and
$\lambda^*$ = 0.005, 0.015, ..., 0.535, 0.55, 0.65, ..., 74.95, 75.5, 76.5, ..., 300.5. Neyman-shortest unbiased
confidence sets were then read off from this print-out.
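
The computation described above can be sketched as follows. This is only an illustration in Python, not the ILLIAC program: the brute-force search over integer pairs, the cut-off `n_max` and all function names are assumptions introduced here. It evaluates the two formulas for $\gamma_0$ and $\gamma_1$ for each candidate pair $(n_0, n_1)$ and keeps the first pair for which both values lie in [0, 1].

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X Poisson with mean lam > 0, computed on the log scale."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def acceptance_region(lam, alpha, n_max=None):
    """Search for (n0, gamma0, n1, gamma1) from the two displayed formulas.

    The exhaustive search and the default n_max are illustrative assumptions.
    """
    if n_max is None:
        n_max = int(lam + 20.0 * math.sqrt(lam) + 20.0)
    pmf = [poisson_pmf(k, lam) for k in range(n_max + 1)]
    for n0 in range(n_max):
        for n1 in range(n0 + 1, n_max + 1):
            if pmf[n0] == 0.0 or pmf[n1] == 0.0:
                continue  # avoid dividing by an underflowed probability
            inner = sum(pmf[n0:n1])  # P(n0 <= X <= n1 - 1)
            common = -n0 * pmf[n0] + n1 * pmf[n1]
            g0 = ((n1 - lam) * (inner - alpha) + common) / ((n1 - n0) * pmf[n0])
            g1 = ((n0 - lam) * (inner - alpha) + common) / ((n1 - n0) * pmf[n1])
            if 0.0 <= g0 <= 1.0 and 0.0 <= g1 <= 1.0:
                return n0, g0, n1, g1
    raise ValueError("no region found; try a larger n_max")


def region_moments(lam, n0, g0, n1, g1):
    """Return (P(accept), E[X; accept]) for n0 + g0 <= X + Y <= n1 + g1."""
    p0, p1 = poisson_pmf(n0, lam), poisson_pmf(n1, lam)
    prob = sum(poisson_pmf(k, lam) for k in range(n0, n1)) - g0 * p0 + g1 * p1
    mean = (sum(k * poisson_pmf(k, lam) for k in range(n0, n1))
            - g0 * n0 * p0 + g1 * n1 * p1)
    return prob, mean
```

For a region found this way, `region_moments` should return an acceptance probability equal to $\alpha$ and a partial expectation equal to $\alpha\lambda^*$, which are the size and unbiasedness conditions the formulas encode.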


2. Use of the tables. Let X be a Poisson variable having expectation $\lambda$ and Y be a Rectangular (0, 1)
variable, with X and Y independent. If X = x and Y = y (using a table of random numbers) are
observed, enter the table for $\alpha$ at the location x + y and read there the observed $\alpha$-confidence interval
for $\lambda$. Tables are given for $\alpha$ = 0.95, 0.99 and for values of X + Y as follows:


0.01 (0.01) 0.10 (0.02) 0.20 (0.05) 1.00 (0.1) 10.0 (0.2) 40.0 (0.5) 55.0 (1) 250.


The intervals at which X + Y is given and the number of places given in the confidence limits are so
chosen that interpolation and round-off errors are small compared to the length of the interval (less than
2% except for a few X + Y values below 1). For large values of X the Normal approximation to the
Poisson distribution gives the approximate $\alpha$-confidence interval $\tfrac12\{c^2 + 2X \pm c(c^2 + 4X)^{1/2}\}$, where c is the
upper $\tfrac12(1-\alpha)$ point of the standardized Normal (0, 1) distribution. For X > 250 the differences between these
limits and the Neyman-shortest unbiased limits are small compared to the length of the interval (less
than 1%).
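
As an illustration of the Normal approximation just described, the short Python sketch below evaluates the two limits. The function name is hypothetical, and c is taken to be the Normal point with upper-tail probability $\tfrac12(1-\alpha)$, which is the reading of the text assumed here.

```python
import math
from statistics import NormalDist


def normal_approx_interval(x, alpha):
    """Approximate alpha-confidence limits (1/2){c^2 + 2x -/+ c sqrt(c^2 + 4x)}."""
    # c has upper-tail probability (1 - alpha)/2 under the standard Normal.
    c = NormalDist().inv_cdf(0.5 * (1.0 + alpha))
    centre = c * c + 2.0 * x
    half_width = c * math.sqrt(c * c + 4.0 * x)
    return 0.5 * (centre - half_width), 0.5 * (centre + half_width)
```

Both limits are the roots in $\lambda$ of $(x-\lambda)^2 = c^2\lambda$, so they can be checked by substituting them back into that equation.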


Example 1. If X is Poisson (expectation $\lambda$) and X = 23, Y = 0.65 are observed, 0.99 confidence limits
for $\lambda$ are 12.7, 37.9. For X = 23, depending on the observed Y, the lower limit will fall between 12.4
and 13.1 and the upper limit between 37.0 and 38.3, with an average length of 25.0. The usual equal
probability tail interval first given by Garwood (1936) and tabled in Biometrika Tables for Statisticians, 1,
Table 40 is 12.5, 38.5 for X = 23, with length 26.0.

Example 2. If $X_1, \ldots, X_{20}$ are 20 independent Poisson variables having a common expectation $\lambda$,
then $\sum X_i$ is Poisson (expectation $20\lambda$). If $\sum X_i = 78$ and Y = 0.54 are observed, the observed 0.95
confidence interval for $20\lambda$ is 62, 96; the observed 0.95 confidence interval for $\lambda$ is therefore 3.1, 4.8.
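
The equal probability tail interval quoted in Example 1 can be recomputed with a few lines of Python. This is a sketch under stated assumptions: the function names and the bisection solver are introduced here for illustration, and the interval is obtained by solving $P_\lambda(X \ge x) = \tfrac12(1-\alpha)$ for the lower limit and $P_\lambda(X \le x) = \tfrac12(1-\alpha)$ for the upper limit.

```python
import math


def poisson_cdf(x, lam):
    """P(X <= x) for X Poisson with mean lam."""
    term = math.exp(-lam)
    total = term
    for k in range(1, x + 1):
        term *= lam / k
        total += term
    return total


def solve_decreasing(f, lo, hi, tol=1e-10):
    """Root of a continuous decreasing f with f(lo) > 0 > f(hi), by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def equal_tail_interval(x, alpha):
    """Equal probability tail alpha-confidence limits for the Poisson mean."""
    tail = 0.5 * (1.0 - alpha)
    if x == 0:
        lower = 0.0
    else:
        # tail - P(X >= x) decreases as lam grows
        lower = solve_decreasing(
            lambda lam: tail - (1.0 - poisson_cdf(x - 1, lam)), 0.0, x + 1.0)
    # P(X <= x) - tail decreases as lam grows
    upper = solve_decreasing(
        lambda lam: poisson_cdf(x, lam) - tail,
        float(x), x + 10.0 * math.sqrt(x + 1.0) + 20.0)
    return lower, upper
```

Calling `equal_tail_interval(23, 0.99)` should give limits close to the 12.5 and 38.5 quoted in Example 1.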










REFERENCES 


BLYTH, C. R. & HUTCHINSON, D. W. (1960). Table of Neyman-shortest unbiased confidence intervals
for the binomial parameter. Biometrika, 47, 381-91.
GARWOOD, F. (1936). Fiducial limits for the Poisson distribution. Biometrika, 28, 437-42.


† Work supported by the National Science Foundation.

















X+Y 
0-01 
-02 
03 
04 
05 


0-06 
07 
08 
09 
10 


0-12 
“14 
16 
18 
20 


0-25 
30 
+35 
-40 
45 


2-1 
2-2 
2°3 
2-4 
2-5 


2-6 
2-7 
2-8 
2-9 
3-0 


3-1 
3-2 
3-3 
3-4 
3-5 


Table 1. Neyman-shortest unbiased 0.95 and 0.99 confidence intervals for $\lambda$ based on X,
Poisson ($\lambda$), and an independent Y, Rectangular (0, 1)








$\alpha = 0.99$ | X+Y | $\alpha = 0.95$ | $\alpha = 0.99$ | X+Y | $\alpha = 0.95$
0 0 3-6 0-5 81 0-2 10:3 9-1 40 15:8 
0 ‘9 3-7 6 82 3 10-5 9-2 40 15-9 
0 1-3 3:8 6 8-4 3 «10-6 9-3 41 16-1 
0 1-7 3-9 ‘7 85 4 10-7 9-4 4-1 16-2 
0 1-9 4-0 8 86 4 10:8 9-5 4:2 16:3 
0 2-1 4-1 0-8 8-8 0-5 11-0 9-6 43 16-5 
0 2-3 4-2 § 9-0 5 11:3 9-7 44 16-6 
0 2-4 4-3 9 91 5 115 9-8 44 16:7 
0 2-5 4:4 10 93 ‘5 11-6 9-9 45 16:8 
0 2-7 4-5 10 95 5 11:8 10-0 46 16-9 
0 2-9 4-6 1-1 9-6 0-6 11-9 10-2 4:7 17-2 
0 3-0 4-7 1-1 9-7 6 12-1 10-4 4:8 17-5 
0 3-2 4:8 12 98 “7 12-2 10-6 5:0 17:8 
0 3-3 4-9 13 10-0 ‘7 12:3 10-8 5-1 18-0 
0 3-4 5-0 14 10-1 8 12-4 11-0 5:3 18-2 
0 3-6 5-1 14 10:3 0-8 12-6 11-2 5-4 18-5 
0 3:8 5-2 14 10-4 9 12-9 11-4 5-6 18:8 
0 4:0 5-3 15 10-6 9 13-0 11-6 5-7 19-1 
0 4-1 5-4 15 10:8 9 13-2 11-8 5-8 19:3 
0 4:3 5-5 16 10-9 10 13-4 12-0 6-0 19-5 
0 4-4 5-6 16 11-0 10 13-5 12-2 6-1 19:8 
0 4:5 5-7 1-7 11-2 1-1 13-6 12-4 63 20-1 
0 4-6 5-8 18 11:3 1-1 13-8 12-6 6-4 20:3 
0 4-7 5-9 19 11-4 12 13-9 12-8 6-6 20-6 
0 4-7 6-0 20 11-5 13 14-0 13-0 6-8 20-8 
0 4:8 6-1 20 11-7 13 14-2 13-2 6-9 21-1 
0 4-9 6-2 2-1 11:8 13 14-4 13-4 70 21-4 
0 4-9 6-3 2-1 12-0 14 14-6 13-6 7-1 21-6 
0 5-0 6-4 2-2 12-2 14 14-7 13-8 73 21:8 
0 5-1 6-5 2:2 12:3 14 149 14-0 7-5 22-1 
0 5-5 6-6 2:3 12-4 15 15-0 14-2 76 22-4 
0 5-9 6-7 2-4 126 16 15-1 14-4 7-7 22-6 
0 6-1 6-8 2-4 12-7 16 15:3 14-6 79 22-9 
0 6-4 6-9 25 12:8 1-7 15-4 14:8 8-0 23-1 
0 6-6 7-0 26 12:9 18 15-5 15-0 8-2 23-3 
0 6-7 71 2:7 13-1 18 15-7 15-2 8-3 23-6 
0 6-9 7-2 2-7 13-2 19 15:9 15-4 8-5 23-9 
0 7-0 7:3 2:8 13-4 19 16-0 15-6 8-6 24-1 
0 71 7-4 28 13-5 19 16-2 15-8 8-8 24-4 
0 7-2 75 2:9 13-7 20 16-4 16-0 9-0 24-6 
0 7-6 7-6 29 13:8 20 16:5 16-2 9-1 24-9 
0 7:8 7-7 3-0 13-9 2:1 16-6 16-4 9-2 25-1 
0 8-1 7:8 3-1 14-0 2:2 16:7 16-6 9:4 25-4 
0 8-3 7-9 32 141 2-2 16:8 16-8 9-5 25-6 
0 8-4 8-0 3-3 14:3 2:3 17-0 17-0 9-7 25:8 
0 8-6 8-1 3-3 14-4 2-4 17-2 17-2 9-9 26-1 
0 8-7 8-2 3-4 146 2-4 17:3 17-4 10:0 26-4 
0 8-8 8-3 3-4 14-7 2-4 17-5 17-6 10:2 26-6 
0-1 9-0 8-4 3-5 14-9 25 17-7 17-8 10:3 26-8 
‘1 91 8-5 3-6 15-0 25 17:8 18-0 10-5 27-1 
0-2 9-4 8-6 3-6 15-1 26 17-9 18-2 10-6 27-3 
2 96 8-7 3-7 15:3 2:7 181 18-4 10-8 27-6 
2 98 8-8 3-8 15-4 2:7 18-2 18-6 10:9 27:8 
+2 10-0 8-9 33 15-5 28 183 18-8 11-1 28-1 
+2 10-2 9-0 39 15:6 29 18-4 19-0 11-3 28:3 





$\alpha = 0.99$


2-9 
3-0 
3-0 
3-1 
31 


3-2 
3-3 
33 
3-4 
3-5 


3-6 
3-7 
3-8 
4-0 
4-1 


4-2 
4:3 
4-4 
4-6 
4-8 


4-9 
5-0 
5-1 
5-2 
5-4 


5:5 
5-6 
5-7 
5-9 
6-0 


6-1 
6-3 
6-4 
6-5 
6-7 


6-8 
6-9 
71 
7-2 
7-4 


7-5 
7-6 
7-9 
7-9 
8-1 


8-2 
8-3 
8-4 
8-6 
8-8 


8-9 
9-0 
9-1 
9-3 
9-5 


Note: the pairs of figures under each $\alpha$-heading are lower and upper confidence limits for $\lambda$.


18-6 
18-8 
18-9 
19-1 
19-2 


19-4 
19-5 
19-6 
19-7 
19-8 


20-2 
20-5 
20-8 
21-0 
21-2 


21-6 
21-9 
22-1 
22-4 
22-6 


22-9 
23-3 
23-5 
23-8 
24-0 


24-3 
24-6 
24-9 
25-1 
25:3 


25-7 
26-0 
26-2 
26-4 
26-7 


27-0 
27-3 
27-5 
27-8 
28-0 


28-3 
28-6 
28-9 
29-1 
29-3 


29-6 
29-9 
30-3 
30-4 
30-6 


30-9 
31-2 
31-5 
31-7 
31-9 

























X+Y 
30-2 
30-4 
30-6 
30°8 
31-0 


31-2 
31-4 
31-6 
31-8 
32-0 


32-2 
32-4 
32-6 
32-8 
33-0 


33-2 
33-4 
33-6 
33-8 
34-0 


34-2 
34-4 
34-6 
34-8 
35-0 


35-2 
35-4 
35-6 
35-8 
36-0 


36-2 
36-4 
36-6 
36-8 
37-0 


37-2 
37-4 
37-6 
37:8 
38-0 


38-2 
38-4 
38-6 
38:8 
39-0 


39-2 
39-4 
39-6 
39-8 
40-0 


40-5 
41-0 
41:5 
42-0 
42-5 


Table 1 (cont.) 


$\alpha = 0.95$


20-2 
20-4 
20-5 
20-7 
20-9 


21-0 
21-2 
21-4 
21-5 
21-7 


21-9 
22-0 
22-2 
22-4 
22-5 


22-7 
22-8 
23-0 
23-2 
23-4 


23-5 
23-7 
23-8 
24-0 
24-2 


24-4 
24-5 
24-7 
24-9 
25-0 


25-2 
25-3 
25-5 
25-7 
25-9 


26-0 
26-2 
26-4 
26-5 
26-7 


26-9 
27-0 
27-2 
27-4 
27-6 


27-7 
27-9 
28-0 
28-2 
28-4 


28-8 
29-3 
29-6 
30-1 
30-5 


41-7 
42-0 
42-2 
42-4 
42-7 


42-9 
43-2 
43-4 
43-6 
43-8 


44-1] 
44-3 
44-6 
44-8 
45-0 


45:3 
45-5 
45-7 
46-0 
46-2 


46-4 
46-7 
46-9 
47-1 
47-3 


47-6 
47-8 
48-1 
48-3 
48-5 


48-8 
49-0 
49-2 
49-5 
49-7 


49-9 
50-2 
50-4 
50-6 
50-8 


51-1 
51:3 
51-6 
51:8 
52-0 


52-2 
52-5 
52-7 
52-9 
53-1 


53-8 
54:3 
54-9 
55-5 
56-1 


$\alpha = 0.99$


17-7 
17-9 
18-0 
18-2 
18-4 


18-5 
18-6 
18-8 
18-9 
19-1 


19-3 
19-4 
19-6 
19-7 
19-9 


20-0 
20-2 
20-3 
20-5 
20-7 


20-8 
21-0 
21-1 
21-3 
21-5 


21-6 
21-7 
21-9 
22-1 
22-3 


22-4 
22-5 
22-7 
22-9 
23-0 


23-2 
23-3 
23-5 
23-6 
23-8 


24-0 
24-1 
24-3 
24-4 
24-6 


24-8 
24-9 
25-1 
25-2 
25-4 


25-8 
26-2 
26-6 
27-0 
27-4 


46-1 
46-4 
46-6 
46-8 
47-0 


47-3 
47-6 
47-8 
48-1 
48-3 


48-6 
48-8 
49-1 
49-3 
49-5 


49-8 
50-0 
50-3 
50-5 
50-7 


51-0 
51:3 
51-5 
51-7 
51-9 


52-2 
52-5 
52-7 
52-9 
53-2 


53-4 
53:7 
53-9 
54-2 
54-4 


54-6 
54-9 
55-1 
55-4 
55-6 


55-9 
56-1 
56-4 
56-6 
56-8 


57-1 
57-3 
57-6 
57:8 
58-0 


58-6 
59-2 
59-8 
60-4 
61-0 





X+Y 
43-0 
43-5 
44-0 
44-5 
45-0 


45-5 
46-0 
46-5 
47-0 
47-5 


48-0 
48-5 
49-0 
49-5 
50-0 


50-5 
51-0 
51-5 
52-0 
52-5 


53-0 
53-5 
54-0 
54:5 
55-0 


$\alpha = 0.95$


31-0 
31-3 
31-8 
32-2 
32-7 


33-1 
33-5 
33-9 
34-4 
34-8 


35-2 
35-6 
36-1 
36-5 
36-9 


37-3 
37-8 
38-2 
38-7 
39-1 


39°5 
39-9 
40-4 
40°8 
41-3 


42-1 
43-0 
43-9 
44-7 
46 


46 
47 
48 
49 
50 


56-6 
57-2 
57-8 
58-4 
58-9 


59-5 
60-0 
60-6 
61-2 
61-8 


62-3 
62-9 
63-5 
64-1 
64-6 


65-2 
65-8 
66-4 
66-9 
67-5 


68-0 
68-6 
69-2 
69-8 
70-3 


71-4 
72-6 
73-7 
74:8 
76 


77 
78 
79 
80 
82 


83 
84 
85 
86 
87 


88 
89 
91 
92 
93 


94 
95 
96 
97 
98 


99 
101 
102 
103 
104 


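$\alpha = 0.99$
X+Y     lower   upper
43.0    27.8    61.6
43.5    28.2    62.2
44.0    28.6    62.8
44.5    29.0    63.4
45.0    29.4    64.0
45.5    29.8    64.6
46.0    30.2    65.2
46.5    30.6    65.8
47.0    31.1    66.4
47.5    31.4    67.0
48.0    31.9    67.5
48.5    32.2    68.2
49.0    32.7    68.7
49.5    33.1    69.4
50.0    33.5    69.9
50.5    33.9    70.6
51.0    34.3    71.1
51.5    34.7    71.7
52.0    35.1    72.3
52.5    35.5    72.9
53.0    36.0    73.5
53.5    36.3    74.1
54.0    36.8    74.6
54.5    37      75
55.0    38      76
56      38      77
57      39      78
58      40      79
59      41      80
60      42      82
61      43      83
62      43      84
63      44      85
64      45      86
65      46      87
66      47      89
67      48      90
68      48      91
69      49      92
70      50      93
71      51      94
72      52      96
73      53      97
74      54      98
75      54      99
76      55      100
77      56      101
78      57      102
79      58      104
80      59      105
81      60      106
82      60      107
83      61      108
84      62      109
85      63      110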










X+Y 


86 
87 
88 
89 
9 


91 
92 
93 
94 
95 


96 
97 
98 
99 
100 


101 
102 
103 
104 
105 


106 
107 
108 
109 
110 


111 
112 
113 
114 
115 


116 
117 
118 
119 
120 


121 
122 
123 
124 
125 


126 
127 
128 
129 
130 


131 
132 
133 
134 
135 


136 
137 
138 
139 
140 


$\alpha = 0.95$


69 
70 
70 
71 
72 


73 
74 
75 
76 
77 


78 
78 
79 
80 
81 


82 
83 
84 
85 
86 


87 
88 
88 
89 
90 


91 
92 
93 
94 
95 


96 
97 
98 
98 
99 


100 
101 
102 
103 
104 


105 
106 
107 
108 
108 


109 
110 
lll 
112 
113 


114 
115 
116 
117 
118 


105 
106 
107 
108 
109 


110 
112 
113 
114 
115 


116 
117 
118 
119 
120 


121 
123 
124 
125 
126 


127 
128 
129 
130 
131 


132 
134 
135 
136 
137 


138 
139 
140 
141 
142 


143 
144 
146 
147 
148 


149 
150 
151 
152 
153 


154 
155 
156 
157 
159 


160 
161 
162 
163 
164 





Table 1 (cont.) 





$\alpha = 0.99$ | X+Y | $\alpha = 0.95$ | $\alpha = 0.99$ | X+Y | $\alpha = 0.95$ | $\alpha = 0.99$

64 112 141 119 165 112 173 196 169 224 162 234 
65 113 142 119 166 113 174 197 170 225 163 235
66 114 143 120 167 114 176 198 171 226 163 236 
66 115 144 121 168 115 177 199 172 227 164 237 
67 116 145 122 169 116 178 200 173 228 165 238 
68 117 146 123 170 117 179 201 174 230 166 239 
69 118 147 124 172 117 180 202 175 231 167 240 
70 120 148 125 173 118 181 203 176 232 168 241 
71 121 149 126 174 119 182 204 177 233 169 242 
72 122 150 127 175 120 183 205 178 234 170 244 
72 123 151 128 176 121 184 206 179 235 171 245 
73 124 152 129 177 122 185 207 180 236 172 246
74 125 153 130 178 123 187 208 181 237 173 247
75 126 154 130 179 124 188 209 181 238 173 248 
76 127 155 131 180 125 189 210 182 239 174 249 
77 129 156 132 181 126 190 211 183 240 175 250 
78 130 157 133 182 126 191 212 184 241 176 251 
79 131 158 134 183 127 192 213 185 242 177 252
79 132 159 135 184 128 193 214 186 243 178 253 
80 133 160 136 186 129 194 215 187 245 179 254 
81 134 161 137 187 130 195 216 188 246 180 256 
82 135 162 138 188 131 196 217 189 247 181 257 
83 136 163 139 189 132 198 218 190 248 182 258 
84 138 164 140 190 133 199 219 191 249 183 259 
85 139 165 141 191 134 200 220 192 250 184 260 
86 140 166 142 192 135 201 221 193 251 184 261 
86 141 167 142 193 135 202 222 194 252 185 262 
87 142 168 143 194 136 203 223 195 253 186 263 
88 143 169 144 195 137 204 224 195 254 187 264 
89 144 170 145 196 138 205 225 196 255 188 265 
90 145 171 146 197 139 206 226 197 256 189 266 
91 147 172 147 198 140 207 227 198 257 190 268 
92 148 173 148 200 141 209 228 199 258 191 269 
93 149 174 149 201 142 210 229 200 259 192 270 
93 150 175 150 202 143 211 230 201 260 193 271 
94 151 176 151 203 144 212 231 202 262 194 272 
95 152 177 152 204 144 213 232 203 263 194 273 
96 153 178 153 205 145 214 233 204 264 195 274 
97 154 179 154 206 146 215 234 205 265 196 275 
98 156 180 154 207 147 216 235 206 266 197 276 
99 157 181 155 208 148 217 236 207 267 198 277 
100 158 182 156 209 149 218 237 208 268 199 278 
101 159 183 157 210 150 220 238 209 269 200 279 
101 160 184 158 211 151 221 239 209 270 201 281
102 161 185 159 212 152 222 240 210 271 202 282
103 162 186 160 213 153 223 241 211 272 203 283 
104 163 187 161 215 153 225 242 212 273 204 284 
105 164 188 162 216 154 225 243 213 274 205 285 
106 166 189 163 217 155 226 244 214 275 205 286 
107 167 190 164 218 156 227 245 215 276 206 287 
108 168 191 165 219 157 228 246 216 278 207 288 
109 169 192 166 220 158 229 247 217 279 208 289
109 170 193 167 221 159 230 248 218 280 209 290 
110 171 194 167 222 160 232 249 219 281 210 291 
111 172 195 168 223 161 233 250 220 282 211 292 





Note: the pairs of figures under each $\alpha$-heading are lower and upper confidence limits for $\lambda$.
















Upper percentage points of a substitute F-ratio using ranges 


By K. C. SREEDHARAN PILLAI† AND ANGELES R. BUENAVENTURA‡
The Statistical Center, University of the Philippines 


Let $F' = w_1/w_2$ where $w_1$ and $w_2$ are ranges of two independent samples of size $n_1$ and $n_2$, respectively,
taken from normal populations having the same standard deviation. Table 2 gives the upper 5 and 1 %
points of $F'$, a substitute F-ratio, for values of $n_1$, $n_2$ varying from 2 to 10. This extends the work of
Pillai (1951, 1957) who gave upper 5 and 1 % points of $F'$ for $n_1$, $n_2$ ranging up to 8. The present work
has also been directed towards correcting the values of $F'$ which can be derived from the table of Link
(1950) by taking reciprocals. His values for smaller sample sizes are not accurate enough for practical
use.

Tabulation of the percentage points was made by extending Pillai's methods. First, his approximation
to the distribution of the semi-range, $W$, as a series of gamma functions (Pillai, 1948, 1950, 1951) was
improved by considering more terms in the expansion and the coefficients of the first few terms were
tabulated (Buenaventura, 1959) for values of sample size from 3 to 12 instead of 3 to 8 as given by Pillai
(1952). The distribution of the semi-range derived in this manner is given by


S(W) = W-te-tnsow'D, (1) 
where D = CM + C0" W240 WA+..., 


and $n$ is the size of the sample, taken from a normal population with unit standard deviation, from which
the semi-range $W$ was obtained. The $C_i^{(n)}$ coefficients for values of $n$ ranging from 3 to 12 are given in
Table 1.

Using $f(W)$ in (1), the distribution of $F'$ was obtained following Pillai (1951) in the form


[oe] rT 
fF’) = 5 BD Peden ta) +7 — ORO PYF Bing + A+ (ms APA HM — r+ 1. (2) 
T= = 


Percentage points presented in Table 2 were based on series (2). The incomplete beta functions involved
in the cumulative distribution function of $F'$ derivable from (2) were mostly integrated by parts in
order to have sufficient accuracy. For $n_2 = 2$, the values in Table 2 can be computed by dividing by $\sqrt2$
the corresponding percentage points of the studentized range (Pillai, 1957; Harter, Clemm & Guthrie,
1959).
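
Two quick numerical checks on Table 2 can be sketched in Python. Neither is the authors' series method. The first uses a fact not stated in the note: for $n_1 = n_2 = 2$ each range is proportional to the absolute value of a Normal variable, so $F'$ is distributed as the absolute value of a Cauchy variable and its upper points are available in closed form. The second is a plain Monte Carlo estimate; the function names, seed and number of trials are arbitrary choices made here.

```python
import math
import random


def f_prime_point_2_2(upper_tail):
    """Upper percentage point of F' = w1/w2 for n1 = n2 = 2 (|Cauchy| quantile)."""
    return math.tan(0.5 * math.pi * (1.0 - upper_tail))


def simulate_point(n1, n2, upper_tail, trials=20000, seed=1961):
    """Monte Carlo estimate of an upper percentage point of F' = w1/w2."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        s1 = [rng.gauss(0.0, 1.0) for _ in range(n1)]
        s2 = [rng.gauss(0.0, 1.0) for _ in range(n2)]
        ratios.append((max(s1) - min(s1)) / (max(s2) - min(s2)))
    ratios.sort()
    # empirical quantile at probability 1 - upper_tail
    return ratios[int((1.0 - upper_tail) * trials)]
```

The closed form reproduces the corner entries 12.71 and 63.66 of Table 2, and the simulation gives a rough check on interior entries such as the 5 % point for $n_1 = n_2 = 5$.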


The authors wish to acknowledge the facilities offered by the Statistical Center, University of the 
Philippines, in the preparation of this paper. 


REFERENCES 


BUENAVENTURA, A. R. (1959). On a Substitute F-ratio Using Ranges. M.A.S. Thesis submitted to
the Graduate School, University of the Philippines.

HARTER, H. L., CLEMM, D. S. & GUTHRIE, E. H. (1959). The Probability Integral of the Range and of
the Studentized Range. Ohio: Wright Air Development Center. Tech. Report No. 58-484, 2.

LINK, R. F. (1950). The sampling distribution of the ratio of two ranges from independent samples.
Ann. Math. Statist. 21, 112-6.

PILLAI, K. C. S. (1948). A note on ordered samples. Sankhyā, 8, 375-80.

PILLAI, K. C. S. (1950). On the distributions of mid-range and semi-range in samples from a normal
population. Ann. Math. Statist. 21, 100-5.

PILLAI, K. C. S. (1951). Some notes on ordered samples from a normal population. Sankhyā, 11, 23-8.

PILLAI, K. C. S. (1952). On the distribution of 'Studentized' range. Biometrika, 39, 194-5.

PILLAI, K. C. S. (1957). Concise Tables for Statisticians. Manila: The Statistical Center.


† United Nations Senior Adviser in Mathematical Statistics and Visiting Professor of Statistics.
Now with United Nations, New York.
‡ Research Instructor.







Table 1. Values of $C_i^{(n)}$ coefficients for n = 3, 4, ..., 12


n ce” com” cm” om 
3 2-2053 15582 0-1228 72039 0-01417 55429 0-094572 57432 
4 3-0476 94522 0-2542 84571 0-05915 73835 0-071347 65185 
5 3-6249 76870 0-3627 24248 0-1195 48715 0-074726 25527 
6 3-96046 2787 0-4402 10583 0-1833 01703 0-079909 48370 
7 4-0958 17895 0-4877 08428 0-2426 00723 0-01585 23602 
8 4-0758 99468 0-5095 65175 0.2929 29359 0-02170 90301 
9 3-9421 37188 0-5110 72548 0-3321 43307 0-0269 1 69907 
10 3-7299 50454 0-4973 65581 0-3597 20048 0-03116 04640 
11 3-4681 42690 0-4729 56308 0-3761 83911 0-03430 93722 
12 3-1792 44908 0-4415 81756 0-3826 80232 0-03636 18641 
n om cy cy cP? 
3 0-042189 56371 0-0°6973 23557 0-0°1046 91042 0-082219 29693 
4 0-094576 33423 — 0-0°7798 29201 0-051562 44912 0-0°1801 46225 
5 0-071430 01981 0-0°7386 93546 0-053400 48316 0-052766 52948 
6 0-073170 05571 0-045244 90681 0-041031 01293 0-041046 34952 
7 0-075610 12308 0-0°1483 22530 0-042932 30434 0042359 09785 
8 0-078533 62668 0-082995 52913 0-046788 12458 0-044114 14903 
9 0-01168 44731 0-0°4963 15277 0-0°1335 27171 0-045984 07437 
10 0-01482 42995 0-0°7217 10268 0-052268 90958 0-047809 74046 
ll 0-01775 83304 0-0°9573 63192 0-0°3447 66736 0-049470 27794 
12 0-02034 36661 6-071186 71181 0-0°4810 41785 0-0°1090 05391 
Table 2. Percentage points of the distribution of $F' = w_1/w_2$*

                                      $n_1$
$n_2$      2       3       4       5       6       7       8       9      10

Upper 5 % points
  2     12.71   19.08   23.2    26.2    28.6    30.5    32.1    33.5    34.7
  3      3.19    4.37    5.13    5.72    6.16    6.53    6.85    7.12    7.33
  4      2.03    2.66    3.08    3.38    3.62    3.84    4.00    4.14    4.26
  5      1.60    2.05    2.35    2.57    2.75    2.89    3.00    3.11    3.19
  6      1.38    1.74    1.99    2.17    2.31    2.42    2.52    2.61    2.69
  7      1.24    1.57    1.77    1.92    2.04    2.13    2.21    2.28    2.34
  8      1.15    1.43    1.61    1.75    1.86    1.94    2.01    2.08    2.13
  9      1.09    1.33    1.49    1.62    1.72    1.79    1.86    1.92    1.96
 10      1.05    1.26    1.42    1.54    1.63    1.69    1.76    1.82    1.85

Upper 1 % points
  2     63.66   95.49  116.1   131     143     153     161     168     174
  3      7.37   10.00   11.64   12.97   13.96   14.79   15.52   16.13   16.60
  4      3.73    4.79    5.50    6.01    6.44    6.80    7.09    7.31    7.51
  5      2.66    3.33    3.75    4.09    4.36    4.57    4.73    4.89    5.00
  6      2.17    2.66    2.98    3.23    3.42    3.58    3.71    3.81    3.88
  7      1.89    2.29    2.57    2.75    2.90    3.03    3.13    3.24    3.33
  8      1.70    2.05    2.27    2.44    2.55    2.67    2.76    2.84    2.91
  9      1.57    1.89    2.07    2.22    2.32    2.43    2.50    2.56    2.63
 10      1.47    1.77    1.92    2.06    2.16    2.26    2.33    2.38    2.44


* $w_1$ denotes the larger range. For a test of equality of two population variances, the 5 % and 1 %
levels of this table are the 10 % and 2 % levels, respectively, of a two-tailed test.

































On Durbin’s formula for the limiting generalized variance of a sample of 
consecutive observations from a moving-average process 


By A. M. WALKER 
Statistical Laboratory, University of Cambridge 


1. Let $x_1, x_2, \ldots, x_n$ be consecutive observations from the $h$th order moving-average process $\{x_t\}$
defined by
$$x_t = \epsilon_t + \beta_1\epsilon_{t-1} + \cdots + \beta_h\epsilon_{t-h},$$
where $\{\epsilon_t\}$ $(t = 0, \pm1, \pm2, \ldots)$ is a set of independent random variables having a common distribution
with mean zero and variance $\sigma^2$. Let the variance-covariance matrix of $x_1, x_2, \ldots, x_n$ be $\sigma^2V_n$, so that if
$V_n = (v_{ij})$ $(i, j = 1, 2, \ldots, n)$,
$$v_{ij} = E(x_tx_{t+|i-j|})/\sigma^2 = \sum_{r=0}^{h}\beta_r\beta_{r+|i-j|},$$
where we define $\beta_0 = 1$ and $\beta_r = 0$ for $r > h$. Let the determinant $|V_n|$ be denoted by $D_n$, so that the
generalized variance of the set of observations is $\sigma^{2n}D_n$. Durbin (1959) has shown that for $h = 1, 2$,
$\lim_{n\to\infty}D_n$ exists provided that the roots of the equation $z^h + \beta_1z^{h-1} + \cdots + \beta_h = 0$ each have modulus less
than unity, and is equal to $\lim_{n\to\infty}D_n'$, where $\sigma^{2n}D_n'$ is the generalized variance of a similar set of observations
from the autoregressive process which is the stationary solution of the equation
$$x_t + \beta_1x_{t-1} + \cdots + \beta_hx_{t-h} = \epsilon_t.$$
Durbin conjectured that the result holds for any value of $h$ and in this note we prove that his conjecture
is correct.

A proof has also recently been given by Finch (1960). He makes use of the theory of Toeplitz forms,
and is thereby able to obtain a generalization of Durbin's result. However, the present proof, while
perhaps less elegant than Finch's, is entirely self-contained, depending only on elementary properties
of determinants and matrices. It also shows that $\lim_{n\to\infty}D_n$ is equal to a determinant of order $h$ whose
elements are quadratic polynomials in $\beta_1, \beta_2, \ldots, \beta_h$, and this is more easily evaluated than the expression
for the limit given by Finch, which is a function of the roots of the equation
$$z^h + \beta_1z^{h-1} + \cdots + \beta_h = 0.$$


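2. Define $\beta_r$ to be zero for $r < 0$ as well as for $r > h$, and let $B_n$ be the matrix with $\beta_{j-i}$ as the element
in row $i$ and column $j$ $(i, j = 1, 2, \ldots, n)$. Then taking $n > h$, we have
$$(B_n'B_n)_{ij} = \sum_{r=1}^{n}\beta_{i-r}\beta_{j-r} = \sum_{r=0}^{h}\beta_r\beta_{r+|i-j|} = v_{ij}$$
whenever $i > h$ or $j > h$. Hence if $U_n = V_n - B_n'B_n$, we can write
$$U_n = \begin{pmatrix} U_h & 0\\ 0 & 0\end{pmatrix},$$
where
$$(U_h)_{ij} = \sum_{r=-(h-1)}^{0}\beta_{i-r}\beta_{j-r} = \sum_{r=0}^{h-1}\beta_{i+r}\beta_{j+r}\qquad(i, j = 1, 2, \ldots, h).$$
Thus
$$V_n^{-1}B_n'B_n = I_n - V_n^{-1}U_n = I_n - \begin{pmatrix} V_n^{(11)} & V_n^{(12)}\\ V_n^{(21)} & V_n^{(22)}\end{pmatrix}\begin{pmatrix} U_h & 0\\ 0 & 0\end{pmatrix},$$
where $V_n^{-1}$ is partitioned into submatrices in the same way as $U_n$, and $I_n$ is the $n\times n$ unit matrix.
Therefore
$$V_n^{-1}B_n'B_n = \begin{pmatrix} I_h - V_n^{(11)}U_h & 0\\ -V_n^{(21)}U_h & I_{n-h}\end{pmatrix}$$
and, as $|B_n| = 1$,
$$|V_n^{-1}| = |V_n^{-1}B_n'B_n| = |I_h - V_n^{(11)}U_h|.\qquad(1)$$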
We now require the following lemma.

LEMMA.
$$\lim_{n\to\infty}(V_n^{-1})_{ij} = \sum_{r=1}^{h}b^{ri}b^{rj},$$
where $B_n^{-1} = (b^{ij})$ $(i, j = 1, 2, \ldots, n)$
($b^{ij}$ does not depend on $n$ because $B_n$, and therefore also $B_n^{-1}$, is an upper triangular matrix).




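Proof. Let $E_n = V_n - C_n$, where $C_n = B_nB_n'$. Then $V_n(V_n^{-1} - C_n^{-1})C_n = -E_n$, so that
$$V_n^{-1} = C_n^{-1} - V_n^{-1}E_nC_n^{-1}.$$
Now
$$(E_n)_{ij} = \sum_{r=-\infty}^{\infty}\beta_{r-i}\beta_{r-j} - \sum_{r=1}^{n}\beta_{r-i}\beta_{r-j} = \sum_{r=n+1}^{n+h}\beta_{r-i}\beta_{r-j},$$
which is zero unless $|i-j| \le h$ and either $i$ or $j > n-h$, and therefore unless both $i$ and $j > n-2h$.
Thus, writing $V_n^{ij}$ for $(V_n^{-1})_{ij}$,
$$V_n^{ij} - (C_n^{-1})_{ij} = -\sum_{r,s=n-2h+1}^{n}V_n^{ir}(E_n)_{rs}(C_n^{-1})_{sj}.\qquad(2)$$
Let $\sum_{r=0}^{\infty}a_r\theta^r$ be the expansion of $(1+\beta_1\theta+\cdots+\beta_h\theta^h)^{-1}$ in powers of $\theta$ (which is certainly valid for
$|\theta| \le 1$). Then it is easy to verify that $b^{ij} = a_{j-i}$ for $j \ge i$, and therefore, if we define $a_r$ to be zero for $r < 0$,
$$(C_n^{-1})_{ij} = \sum_{r=1}^{n}a_{i-r}a_{j-r}.$$
Hence
$$\lim_{n\to\infty}(C_n^{-1})_{n-i,\,j} = \lim_{n\to\infty}\sum_{r=0}^{j-1}a_ra_{n-i-j+r} = 0.$$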


Now clearly $\sup_n|(E_n)_{n-i+1,\,n-j+1}| < \infty$, where $(i, j)$ is any fixed pair of positive integers, and the
supremum is taken over all positive integral values of $n$. The right-hand side of (2) will therefore certainly
tend to zero as $n \to \infty$, thus establishing the lemma, provided that $\sup_n|V_n^{i,\,n-j+1}| < \infty$. Now $V_n$ and
therefore $V_n^{-1}$ is positive definite, so that
$$(V_n^{i,\,n-j+1})^2 \le V_n^{ii}\,V_n^{n-j+1,\,n-j+1} = V_n^{n-i+1,\,n-i+1}\,V_n^{n-j+1,\,n-j+1}$$
(since $V_n$ is not altered when the order of its rows and columns is reversed). Hence it will suffice to show
that for any fixed $i$, $\sup_n V_n^{n-i+1,\,n-i+1} < \infty$. But $v_{11}V_n^{n-i+1,\,n-i+1} = (1-R_{n,i}^2)^{-1}$, where $R_{n,i}$ is the multiple
correlation coefficient between $x_{n-i+1}$ and $(x_1, x_2, \ldots, x_{n-i}, x_{n-i+2}, \ldots, x_n)$ (see, for example, Cramér,
1946, p. 308). It is easily seen, by expressing $x_1, x_2, \ldots, x_n$ in terms of the $\epsilon_t$, that the 'residual' variable
defined by subtracting from $x_{n-i+1}$ its linear regression on the remaining $x$'s has a variance which is not
less than unity. It follows that $\sup_n R_{n,i}^2 < 1$, and therefore $\sup_n V_n^{n-i+1,\,n-i+1} < \infty$.
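Let $a_{ij} = \lim_{n\to\infty}V_n^{ij}$ and $A = (a_{ij})$ $(i, j = 1, 2, \ldots, h)$. Then if $(a^{ij}) = A^{-1}$, it follows from the lemma that
$$a^{ij} = \sum_{r=1}^{h}\beta_{r-i}\beta_{r-j},\qquad\text{i.e. } A^{-1} = B_hB_h'.$$
Therefore from the equation (1)
$$\lim_{n\to\infty}|V_n^{-1}| = |I_h - AU_h| = |A|\,|A^{-1} - U_h|.$$
Also $|A| = 1$, so that finally
$$\lim_{n\to\infty}D_n^{-1} = \lim_{n\to\infty}|V_n^{-1}| = |B_hB_h' - U_h|.$$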
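
The limit obtained in this section can be checked numerically. The Python sketch below is an illustration only: the helper names, the elimination routine and the example coefficients $\beta_1 = 0.5$, $\beta_2 = 0.2$ are assumptions made here. It builds $V_n$ from $v_{ij} = \sum_r\beta_r\beta_{r+|i-j|}$, builds the $h\times h$ matrix $B_hB_h' - U_h$ from $(B_hB_h')_{ij} = \sum_{r=1}^{h}\beta_{r-i}\beta_{r-j}$ and $(U_h)_{ij} = \sum_{r=0}^{h-1}\beta_{i+r}\beta_{j+r}$, and compares $1/|V_n|$ for a moderately large $n$ with $|B_hB_h' - U_h|$.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    d = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d


def beta_at(beta, r):
    """beta = [1, beta_1, ..., beta_h]; coefficients outside 0..h are zero."""
    return beta[r] if 0 <= r < len(beta) else 0.0


def ma_covariance(beta, n):
    """V_n with v_ij = sum_r beta_r beta_{r+|i-j|}."""
    h = len(beta) - 1
    return [[sum(beta_at(beta, r) * beta_at(beta, r + abs(i - j))
                 for r in range(h + 1)) for j in range(n)] for i in range(n)]


def limit_matrix(beta):
    """The h x h matrix B_h B_h' - U_h, with 1-based i, j as in the text."""
    h = len(beta) - 1
    rows = []
    for i in range(1, h + 1):
        row = []
        for j in range(1, h + 1):
            bb = sum(beta_at(beta, r - i) * beta_at(beta, r - j)
                     for r in range(1, h + 1))
            u = sum(beta_at(beta, i + r) * beta_at(beta, j + r)
                    for r in range(h))
            row.append(bb - u)
        rows.append(row)
    return rows
```

For $h = 2$ the determinant of `limit_matrix` reduces to $(1-\beta_2)^2\{(1+\beta_2)^2-\beta_1^2\}$, which gives a convenient hand check for the example coefficients.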


3. We now show that $\lim_{n\to\infty}D_n' = |B_hB_h' - U_h|^{-1}$.

First we note that $D_n'$ is constant for $n \ge h$; this follows from the fact that when $\{x_t\}$ is an autoregressive
process generated by the equation
$$x_t + \beta_1x_{t-1} + \cdots + \beta_hx_{t-h} = \epsilon_t,$$
$$\sum_{t=h+1}^{n}(x_t + \beta_1x_{t-1} + \cdots + \beta_hx_{t-h})^2/\sigma^2 + \text{a quadratic form in } x_1, x_2, \ldots, x_h$$
is the sum of squares of $n$ uncorrelated variables with unit variances [compare Siddiqui, 1958, p. 586,
equation (7)]. It will therefore suffice to prove that $D_h' = |B_hB_h' - U_h|^{-1}$.

Let $W_n$ denote the variance-covariance matrix of $n$ consecutive observations from the above
autoregressive process, and $X_n$ denote the row vector $(x_1, x_2, \ldots, x_n)$. Then we have
$$X_nW_n^{-1}X_n' = X_hW_h^{-1}X_h' + \sum_{t=h+1}^{n}\Big(\sum_{r=0}^{h}\beta_rx_{t-r}\Big)^2\Big/\sigma^2\qquad(n > h)$$
as an identity in $(x_1, x_2, \ldots, x_n)$ (Siddiqui, 1958, equation (8)), and Siddiqui shows how this may be used
to obtain the elements of $W_n^{-1}$ for $n > 2h$. His argument (which is also given by Champernowne, 1948,
p. 206) is in fact valid for $n = 2h$, and using the result
$$\sigma^2(W_{2h}^{-1})_{ij} = \sum_{r=1}^{h}\beta_{i-r}\beta_{j-r}\qquad\text{if } i \le h,\ j \le h,$$












we obtain, on putting $n = 2h$ in the above identity,
$$\sigma^2(W_h^{-1})_{ij} = \sum_{r=1}^{h}\beta_{i-r}\beta_{j-r} - \sum_{r=h+1}^{2h}\beta_{r-i}\beta_{r-j}$$
(compare Durbin, 1959, p. 311, equation (19)). From this it is easily verified that
$$\sigma^2(W_h^{-1})_{ij} = (B_hB_h' - U_h)_{ij},$$
$$\Big[\text{for } i \ge j, \text{ each is equal to } \Big(\sum_{r=0}^{j-1} - \sum_{r=h+1-i}^{h+j-i}\Big)\beta_r\beta_{r+i-j}\Big]$$
so that $(D_h')^{-1} = |B_hB_h' - U_h|$. Hence we have established that $\lim_{n\to\infty}D_n' = \lim_{n\to\infty}D_n$.


REFERENCES 


CHAMPERNOWNE, D. G. (1948). Sampling theory applied to autoregressive sequences. J. R. Statist.
Soc. B, 10, 204-31.

CRAMÉR, H. (1946). Mathematical Methods of Statistics. Princeton University Press.

DURBIN, J. (1959). Efficient estimation of parameters in moving-average models. Biometrika, 46,
306-16.

FINCH, P. D. (1960). On the covariance determinants of moving-average and autoregressive models.
Biometrika, 47, 194-6.

SIDDIQUI, M. M. (1958). On the inversion of the sample covariance matrix in a stationary
autoregressive process. Ann. Math. Statist. 29, 585-8.


The central sampling moments of the mean in samples from a finite population 
(Aty’s formulae and Madow’s central limit) 


By D. E. BARTON AND F. N. DAVID†
University College London 








We consider a sample $x_1, x_2, \ldots, x_n$ randomly drawn without replacement from a finite population
$F = (X_1, X_2, \ldots, X_N)$. The polykays, as defined by Tukey (1950), of this population may be written as
$$K_{p_1^{\pi_1}\cdots\,p_\lambda^{\pi_\lambda}}.$$
Aty (1954) gave the first eight central moments of the sample mean, $k_1$, in terms of these polykays,
and of constants
$$A_r = \sum_{i=1}^{r-1}\alpha^i\left(-\frac{1}{N}\right)^{r-1-i}\qquad(r = 1, 2, \ldots, 8),$$
where
$$\alpha = \left(\frac{1}{n} - \frac{1}{N}\right),$$
but did not establish the general relationship,
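$$M_F(1^r) = \sum\frac{r!}{(p_1!)^{\pi_1}\cdots(p_\lambda!)^{\pi_\lambda}\,\pi_1!\cdots\pi_\lambda!}\,A_{p_1}^{\pi_1}\cdots A_{p_\lambda}^{\pi_\lambda}\,K_{p_1^{\pi_1}\cdots\,p_\lambda^{\pi_\lambda}}.\qquad(1)$$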
In this expression the suffix $F$ denotes 'in sampling from $F$', the summation is over all partitions
$(p_1^{\pi_1}\cdots p_\lambda^{\pi_\lambda})$ of $r$ excluding those with unit parts, $\lambda$ is the number of distinct parts and
$$\pi = \pi_1 + \pi_2 + \cdots + \pi_\lambda$$
is the number of parts. If the $\{X_i\}$ are themselves a sample from an infinite population which has finite
cumulants $\kappa_1, \kappa_2, \ldots$ then, since $A_r$ tends to $n^{-(r-1)}$ as $N$ tends to infinity, the right-hand side of (1) gives
the usual expression for the moments of $k_1$ in independent sampling, on taking expectations and letting
$N$ tend to infinity. The numerical coefficients are thus well tabulated.

We have found the general validation of (1) to be necessary for theoretical purposes (Barton, David
& Fix, 1960) in spite of the fact that in practice Aty's results are sufficient. The general formulae
simplify Madow's (1948) central limit theorem for finite populations and slightly extend it.


† With the partial support of the Office of Naval Research while at the University of California,
Berkeley.


PROOF OF ATY'S FORMULAE


Since $M_F(1^r)$ is necessarily a polynomial, homogeneous and of weight $r$, in the moments of $F$ and since
products of these are uniquely expressible as linear sums of the polykays of weight $r$ (cf. Wishart, 1952),
then $M_F(1^r)$ is uniquely expressible as a linear sum of polykays of weight $r$. Considering $F$, formally,
to be a sample of $N$ from an infinite population $\mathfrak{F}$ whose cumulants are $\kappa_1, \kappa_2, \ldots$ we have from the
definition of the polykays
$$E_{\mathfrak{F}}\big(K_{p_1^{\pi_1}\cdots\,p_\lambda^{\pi_\lambda}}\big) = \kappa_{p_1}^{\pi_1}\cdots\kappa_{p_\lambda}^{\pi_\lambda},$$


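and accordingly it is sufficient to show that
$$E_{\mathfrak{F}}\{M_F(1^r)\} = \sum\frac{r!}{(p_1!)^{\pi_1}\cdots(p_\lambda!)^{\pi_\lambda}\,\pi_1!\cdots\pi_\lambda!}\,A_{p_1}^{\pi_1}\cdots A_{p_\lambda}^{\pi_\lambda}\,\kappa_{p_1}^{\pi_1}\cdots\kappa_{p_\lambda}^{\pi_\lambda}.\qquad(2)$$
For, any other coefficients would imply a generally persisting polynomial identity between $\kappa_1, \ldots, \kappa_r$,
and this by the general theory of moments does not exist. The relation (2) is easily seen to be true. If
$y_1, \ldots, y_{N-n}$ denote the residue of $F$ after $x_1, \ldots, x_n$ have been removed,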











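$$\bar{x} - K_1 = \left(\frac{N-n}{N}\right)(\bar{x} - \bar{y}) = \frac{N-n}{N}\{(\bar{x} - \kappa_1) - (\bar{y} - \kappa_1)\}.\qquad(3)$$
Accordingly the $r$th cumulant of $(\bar{x} - K_1)$ in sampling from $\mathfrak{F}$ is
$$\left(\frac{N-n}{N}\right)^r\big[n^{-(r-1)} + (-1)^r(N-n)^{-(r-1)}\big]\kappa_r = A_r\kappa_r.\qquad(4)$$
Putting (4) into the general expression for moments in terms of cumulants we have (2) which is sufficient
to prove (1).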
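
The two lowest-order cases of the general relationship, $M_F(1^2) = A_2K_2$ and $M_F(1^3) = A_3K_3$, can be verified exactly on a small population by enumerating every sample. The Python sketch below is an illustration under stated assumptions: the function names and the example population are arbitrary, and $K_2$, $K_3$ are taken to be the usual second and third k-statistics of the whole population.

```python
from fractions import Fraction
from itertools import combinations


def population_k_statistics(pop):
    """Return (K1, K2, K3), the first three k-statistics of the population."""
    N = len(pop)
    mean = Fraction(sum(pop), N)
    s2 = sum((x - mean) ** 2 for x in pop)
    s3 = sum((x - mean) ** 3 for x in pop)
    return mean, s2 / (N - 1), N * s3 / ((N - 1) * (N - 2))


def aty_constant(r, n, N):
    """A_r = sum_{i=1}^{r-1} alpha^i (-1/N)^(r-1-i), with alpha = 1/n - 1/N."""
    alpha = Fraction(1, n) - Fraction(1, N)
    return sum(alpha ** i * Fraction(-1, N) ** (r - 1 - i) for i in range(1, r))


def central_moment_of_mean(pop, n, r):
    """Exact r-th central moment of the sample mean over all samples of size n."""
    mean = Fraction(sum(pop), len(pop))
    samples = list(combinations(pop, n))
    return sum((Fraction(sum(s), n) - mean) ** r for s in samples) / len(samples)
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation; the same `aty_constant` can also be compared with the closed form in (4).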

It may be noted that, while for the purposes of the present proof $F$ has been assumed to be a sample
from an infinite population $\mathfrak{F}$, since (1) is a purely algebraic expression for $M_F(1^r)$ in terms of $X_1, \ldots, X_N$,
it may be used when the finite population $F$ does not necessarily have any such provenance.


MADOW'S CENTRAL LIMIT THEOREM
We consider here the behaviour of
$$u = (k_1 - K_1)\,n^{1/2}/K_2^{1/2}$$
under the double limit, $N \to \infty$, $n \to \infty$ with
$$\frac{n}{N} = p + O\!\left(\frac{1}{N}\right)\qquad(0 < p < 1),$$
holding for fixed $p$. The behaviour of the polykays is to be specified. In any event we have
$$E(u) = 0,\qquad E(u^2) \to 1 - p.$$

When each of the $\{K_r/K_2^{r/2}\}$ is bounded as $N \to \infty$, then it is seen from the form of the relation between
the general polykays and those of single suffix that
$$K_2^{-r/2}K_{p_1^{\pi_1}\cdots\,p_\lambda^{\pi_\lambda}} \sim K_2^{-r/2}K_{p_1}^{\pi_1}\cdots K_{p_\lambda}^{\pi_\lambda}$$
is also bounded. Since
$$A_r = O(N^{-(r-1)})$$
it follows that the $r$th moment of $u$, i.e.
$$\left(\frac{n}{K_2}\right)^{r/2}M_F(1^r) \to \begin{cases}0, & r \text{ odd},\\ (r-1)(r-3)\cdots1\,(1-p)^{r/2}, & r \text{ even}.\end{cases}$$
Since the normal distribution is uniquely determined by its moments, by the Fréchet-Shohat Second
Limit theorem $u$ tends to be normally distributed with zero mean and variance $(1-p)$. It will be noticed
that we have required the $K_r$ to be bounded but not necessarily uniformly as with Madow.


The same normal limit also holds if the $(X_1, \ldots, X_N)$ are regarded as a sample from one of a set of
populations which satisfy the conditions of a central limit theorem, by virtue of (3).


MOMENTS OF $k_2$
The corresponding relation to (3) is
$$k_2-K_2 = \frac{N-n}{N-1}\,k_2-\frac{N-n-1}{N-1}\,k_2'-\frac{n(N-n)}{N(N-1)}(\bar{x}-\bar{y})^2,$$
where $k_2$ is the second k statistic of $x_1, x_2, \ldots, x_n$, and $k_2'$ that of $y_1, y_2, \ldots, y_{N-n}$. This may alternatively be written as
$$\frac{N-n}{N-1}\,k_2-\frac{N-n-1}{N-1}\,k_2' = k_2-K_2+\frac{nN}{(N-1)(N-n)}(k_1-K_1)^2.$$





























Miscellanea 201 


The general form of the moments of $k_2$ in F is thus deducible from the moments of $k_2$ and $k_1$ in the infinite population jointly, but not from those of $k_2$ alone. The same is true of the joint moments of $k_1$ and $k_2$ in F. The relationship is useful in that it enables us to exhibit these moments as rational functions of n with coefficients depending on the polykays.

REFERENCES 


ABDEL-ATY, S. H. (1954). Tables of generalized k-statistics. Biometrika, 41, 253-60.

BARTON, D. E., DAVID, F. N. & FIX, EVELYN (1960). The polykays of the natural numbers. Biometrika, 47, 53-9.

MADOW, W. G. (1948). On the limiting distributions of estimates based on samples from finite universes. Ann. Math. Statist. 19, 535-45.

TUKEY, J. W. (1950). Some sampling simplified. J. Amer. Statist. Ass. 45, 501-19.

WISHART, J. (1952). Moment coefficients of the k-statistics in samples from a finite population. Biometrika, 39, 1-13.


A note on the quadrivariate normal integral 


By MAN MOHAN SONDHI 
Department of Electrical Communication Engineering, Indian Institute of Science, Bangalore, India 


In this paper we derive an approximate closed expression for the function
$$P_4(\rho) = \int_0^\infty\!\int_0^\infty\!\int_0^\infty\!\int_0^\infty p(x_1, x_2, x_3, x_4; \rho)\,dx_1\,dx_2\,dx_3\,dx_4, \tag{1}$$
where $p(x_1, x_2, x_3, x_4; \rho)$ is the 4-dimensional normal density function with all off-diagonal elements of the associated correlation matrix equal to $\rho$. Thus $P_4(\rho)$ is the probability that all the four random variables obeying such a distribution are simultaneously positive. To the author’s knowledge, no one has yet succeeded in expressing $P_4(\rho)$ in an exact closed form. McFadden (1956) has recently derived an expression which gives $P_4(\rho)$ accurate to 5 decimal places for $0 \le \rho \le \tfrac12$, but the accuracy falls off rapidly for larger $\rho$. The expression derived in this paper (see equation (5) or (7)) matches the simplicity of McFadden’s expression, and has 5 decimal place accuracy for $\tfrac12 \le \rho \le 1$. By means of the relation†


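$$P_4\!\left(\frac{-\rho}{1+2\rho}\right) + P_4(\rho) = \frac18 + \frac{3}{4\pi}\left[\sin^{-1}\rho - \sin^{-1}\frac{\rho}{1+2\rho}\right] + \frac{3}{2\pi^2}\,\sin^{-1}\rho\,\sin^{-1}\frac{\rho}{1+2\rho}, \tag{2}$$
which is proved in the appendix, $P_4(\rho)$ may be obtained in the range $-\tfrac13 \le \rho \le 0$ as well. Thus a convenient and sufficiently accurate representation of $P_4(\rho)$ is now made possible for the entire permissible range of $\rho$ $(-\tfrac13 \le \rho \le 1)$.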
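To derive the expression mentioned, we first note that
$$P_4(\rho) = \frac{1}{16} + \frac{3}{4\pi}\sin^{-1}\rho + \frac{3}{2\pi^2}\int_0^{\sin^{-1}\rho}\operatorname{cosec}^{-1}(2+\operatorname{cosec}\lambda)\,d\lambda. \tag{3}$$
This result can be proved, for example, by the use of an elegant theorem due to Price (1959); however, the proof is not detailed here. For convenience let
$$g(\rho) = \int_0^{\sin^{-1}\rho}\operatorname{cosec}^{-1}(2+\operatorname{cosec}\lambda)\,d\lambda. \tag{4}$$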


Now, intuitively, $P_4(1) = \tfrac12$, as $\rho = 1$ implies perfect correlation of all the random variables. This fact may also be proved by substituting $\rho = 1$ in (2), and noting that $P_4(-\tfrac13) = 0$. (In fact for the n-variate case $P_n(-1/(n-1)) = 0$.) Using the value of $P_4(1)$ we get from (3)
$$g(1) = \pi^2/24.$$
Once g(1) is known, we may now expand $g(\rho)$ as a Taylor series about $\rho = 1$, since the derivatives of $g(\rho)$ with respect to $\sin^{-1}\rho$ are all easily obtained. On carrying out this expansion to eleven terms (of which five vanish) and substituting in (3) we get


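$$P_4(\rho) \approx \frac12 - \frac{3\phi}{2\pi^2}\left(\tfrac12\pi + \sin^{-1}\tfrac13\right) + \frac{\phi^3}{24\sqrt{2}\,\pi^2}\left(1 + a\phi^2 + b\phi^4 + c\phi^6\right), \tag{5}$$
where
$$\phi = \cos^{-1}\rho, \quad a = \frac{23}{160}, \quad b = \frac{3727}{161280}, \quad c = \frac{3320309}{\ldots}. \tag{6}$$

Equation (5) is the desired expression.
† This relation has been derived independently by J. A. McFadden.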
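Equation (3) can be checked independently of the series by direct numerical quadrature. The following is a minimal sketch and not the author's procedure: the function names `g` and `p4` and the step count are assumptions made here, a composite Simpson rule is used, and the integrand is written as sin⁻¹{sin λ/(1 + 2 sin λ)}, which equals cosec⁻¹(2 + cosec λ) but stays finite at λ = 0.

```python
import math

def g(rho, steps=2000):
    """Simpson's rule for the integral of arccosec(2 + cosec(lam)) from 0 to asin(rho)."""
    upper = math.asin(rho)

    def integrand(lam):
        s = math.sin(lam)
        return math.asin(s / (1.0 + 2.0 * s))   # same function, written without the pole at lam = 0

    h = upper / steps                            # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

def p4(rho):
    """Orthant probability P_4(rho) from equation (3)."""
    return 1.0 / 16.0 + 3.0 * math.asin(rho) / (4.0 * math.pi) + 3.0 * g(rho) / (2.0 * math.pi ** 2)

for rho in (0.5, 0.9, 1.0):
    print(rho, p4(rho))
```

The values at ρ = ½ and ρ = 1 can be compared with the exact results 0.2 and ½ quoted in the text, and the value at ρ = 0.9 with the tabulated figure.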





It may be noted that since $P_4(\tfrac12) = 0.2$, $g(\tfrac12) = \pi^2/120$, and hence $g(\rho)$ could be expanded around $\rho = \tfrac12$. Although this might increase the range of validity of the expression for the same accuracy, the computation of the expansion coefficients around $\rho = \tfrac12$ is extremely cumbersome.


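A simple first-order Shanks transformation (1955) applied to the terms $a\phi^2 + b\phi^4 + c\phi^6$ in (5) gives
$$P_4(\rho) \approx \frac12 - \frac{3\phi}{2\pi^2}\left(\tfrac12\pi + \sin^{-1}\tfrac13\right) + \frac{\phi^3}{24\sqrt{2}\,\pi^2}\left(1 + \phi^2\,\frac{(ac-b^2)\phi^2 - ab}{c\phi^2 - b}\right), \tag{7}$$
where $\phi$, a, b, c are as given in (6).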
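For readers unfamiliar with the device, the first-order Shanks transformation replaces three consecutive partial sums S(n−1), S(n), S(n+1) by (S(n+1)S(n−1) − S(n)²)/(S(n+1) + S(n−1) − 2S(n)). The sketch below is a generic illustration under that usual definition; the function name `shanks` and the sample numbers are assumptions, not taken from the note. For three terms T1, T2, T3 the result reduces to T1 + T2²/(T2 − T3), which is the algebraic form behind (7).

```python
def shanks(partial_sums):
    """First-order Shanks transform of a list of partial sums (one value per interior point)."""
    out = []
    for prev, cur, nxt in zip(partial_sums, partial_sums[1:], partial_sums[2:]):
        denom = nxt + prev - 2.0 * cur
        # If the second difference vanishes the sequence has already converged.
        out.append(nxt if denom == 0 else (nxt * prev - cur * cur) / denom)
    return out

# A geometric series is summed exactly: 1 + 1/2 + 1/4 + ... = 2.
print(shanks([1.0, 1.5, 1.75]))

# Three arbitrary (hypothetical) terms, to show the T1 + T2^2/(T2 - T3) form.
t1, t2, t3 = 0.3, 0.05, 0.01
print(shanks([t1, t1 + t2, t1 + t2 + t3]), t1 + t2 * t2 / (t2 - t3))
```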





For checking the accuracy of (5) and (7), $P_4(\rho)$ was calculated by Moran’s (1956) summation formula with an estimated accuracy to 8 decimal places. The comparison is shown in Table 1. It is seen that (7) is accurate to at least 5 decimal places throughout, and to at least 7 decimal places for $\rho \ge 0.8$.





Table 1

                      $P_4(\rho)$
$\rho$    Eq. (5)        Eq. (7)        Moran
0.5    0.1999 9452    0.1999 9841    0.1999 9991
0.6    0.2334 5168    0.2334 5266    0.2334 5234
0.7    0.2706 8513    0.2706 8530    0.2706 8540
0.8    0.3139 8525    0.3139 8543    0.3139 8544
0.9    0.3693 1236    0.3693 1236    0.3693 1236


The simplicity of the present method compared to Moran’s is evident from the following example. To obtain 5 figure accuracy (which is the least given by the present method) by Moran’s method one has to sum at least 12 terms for $\rho = 0.5$ and at least 26 terms for $\rho = 0.9$. Moreover, the computation of each of these terms requires the values of two functions, which have to be obtained by complicated interpolation from tables. In contrast, (7) requires only one value of $\cos^{-1}\rho$ from tables, and far fewer terms have to be computed.

The first term of Shanks’ second-order transformation was also computed, but it overcorrects the value at $\rho = \tfrac12$ to 0.200085. The attempt was not carried to more terms or higher-order transformations.


The author acknowledges the helpful suggestions of Prof. J. A. McFadden. The financial assistance 
granted by the Council of Scientific and Industrial Research, India, is also gratefully acknowledged. 


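APPENDIX
To prove (2) it suffices to show that
$$g(\rho) + g\!\left(\frac{-\rho}{1+2\rho}\right) = \sin^{-1}\rho\,\sin^{-1}\!\left(\frac{\rho}{1+2\rho}\right). \tag{8}$$
Now
$$g(\rho) = \int_0^{\sin^{-1}\rho}\operatorname{cosec}^{-1}(2+\operatorname{cosec}\lambda)\,d\lambda = \int_0^{\rho}\sin^{-1}\!\left(\frac{\lambda}{1+2\lambda}\right)\frac{d}{d\lambda}\sin^{-1}\lambda\,d\lambda$$
$$\qquad = \sin^{-1}\rho\,\sin^{-1}\!\left(\frac{\rho}{1+2\rho}\right) - \int_0^{\rho}\sin^{-1}\lambda\,\frac{d}{d\lambda}\sin^{-1}\!\left(\frac{\lambda}{1+2\lambda}\right)d\lambda. \tag{9}$$
Also
$$g\!\left(\frac{-\rho}{1+2\rho}\right) = \int_0^{-\rho/(1+2\rho)}\sin^{-1}\!\left(\frac{\lambda}{1+2\lambda}\right)\frac{d}{d\lambda}\sin^{-1}\lambda\,d\lambda = \int_0^{\rho}\sin^{-1}\mu\,\frac{d}{d\mu}\sin^{-1}\!\left(\frac{\mu}{1+2\mu}\right)d\mu, \tag{10}$$
where the change of variable of integration $\mu = -\lambda/(1+2\lambda)$ is made. Adding (9) and (10), we get (8) and hence (2).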
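Relation (8) lends itself to a quick numerical check. The sketch below is an illustration only: the names `g` and `relation_gap`, the Simpson rule and the step count are assumptions made here. It evaluates g at ρ and at −ρ/(1 + 2ρ) and returns the difference between the two sides of (8).

```python
import math

def g(rho, steps=2000):
    """Simpson's rule for the integral of asin(sin(lam) / (1 + 2 sin(lam))) from 0 to asin(rho)."""
    upper = math.asin(rho)
    f = lambda lam: math.asin(math.sin(lam) / (1.0 + 2.0 * math.sin(lam)))
    h = upper / steps                  # h is negative when rho is negative; the rule still applies
    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3.0

def relation_gap(rho):
    """Left-hand side minus right-hand side of relation (8)."""
    partner = -rho / (1.0 + 2.0 * rho)
    lhs = g(rho) + g(partner)
    rhs = math.asin(rho) * math.asin(rho / (1.0 + 2.0 * rho))
    return lhs - rhs

for rho in (0.2, 0.5, 0.8):
    print(rho, relation_gap(rho))   # differences at the level of quadrature error
```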













REFERENCES
MCFADDEN, J. A. (1956). An approximation for the symmetric, quadrivariate normal integral. Biometrika, 43, 206-7.

MORAN, P. A. P. (1956). The numerical evaluation of a class of integrals. Proc. Camb. Phil. Soc. 52, 230-3.

PRICE, R. (1959). A useful theorem for nonlinear devices having Gaussian inputs. I.R.E. Transactions on Information Theory, 4, 69-73.

SHANKS, D. (1955). Nonlinear transformations of divergent and slowly convergent sequences. J. Math. Phys. 34, 1-42.


On the stochastic matrix in a genetic model of Moran 
By J. GANI†


1. INTRODUCTION 


In some recent work Moran (1958) has used a Markov chain embedded in a birth and death process to construct a genetic model for the random fluctuation of gene frequencies in a population of haploid monoecious individuals subject to mutation and selection. The model consists of a fixed number M of gametes of type a or A, and the system is said to be in state j when j of these are a, and M − j are A. If the probabilities of mutation of a gamete a to A and A to a are respectively $\alpha_1$, $\alpha_2$ $(0 \le \alpha_1, \alpha_2 \le 1)$, the transition probabilities $p_{jk}$ from state j to state k of the stochastic matrix
$$P(\alpha_1, \alpha_2) = \{p_{jk}(\alpha_1, \alpha_2)\} \quad (j, k = 0, 1, \ldots, M)$$
for the embedded Markov chain of the system are shown to be
$$p_{j,j-1} = \frac{j}{M}\,q_j, \quad p_{jj} = \frac{j}{M}\,p_j + \left(1-\frac{j}{M}\right)q_j, \quad p_{j,j+1} = \left(1-\frac{j}{M}\right)p_j, \quad p_{jk} = 0 \ (k < j-1,\ k > j+1), \tag{1.1}$$
where
$$p_j = \frac{j}{M}(1-\alpha_1) + \left(1-\frac{j}{M}\right)\alpha_2, \qquad q_j = \frac{j}{M}\,\alpha_1 + \left(1-\frac{j}{M}\right)(1-\alpha_2). \tag{1.2}$$


The essential problem is to determine the rate of progress of the system to homozygosity; this depends on the largest characteristic root smaller than $\lambda_0(\alpha_1, \alpha_2) = 1$ of the matrix $P(\alpha_1, \alpha_2)$. When $\alpha_1 = \alpha_2 = 0$, Hannan (see Appendix to Moran, 1958) has shown that the roots of the matrix P(0, 0) are
$$\lambda_j(0, 0) = 1 - \frac{j(j-1)}{M^2} \quad (j = 0, 1, 2, \ldots, M) \tag{1.3}$$
so that in this case progress to homozygosity will depend on the largest non-unit root
$$\lambda_2(0, 0) = 1 - \frac{2}{M^2}.$$

For $\alpha_1 \ne 0$, $\alpha_2 \ne 0$, Moran uses an interesting ‘expectation’ technique to obtain information about the roots of the stochastic matrix: he considers the random variable $X_t = jM^{-1}$ after the tth transition, and proves that
$$E(X_{t+1}\mid X_t) = \alpha_2 M^{-1} + X_t[1 - M^{-1}(\alpha_1+\alpha_2)],$$
$$E(X_{t+1}^2\mid X_t) = \alpha_2 M^{-2} + X_t[2\alpha_2 M^{-1} + M^{-2}(2-\alpha_1-3\alpha_2)] + X_t^2[1 - 2M^{-1}(\alpha_1+\alpha_2) - 2M^{-2}(1-\alpha_1-\alpha_2)]. \tag{1.4}$$
From these results, he surmises that the coefficients of $X_t$ in $E(X_{t+1}\mid X_t)$ and $X_t^2$ in $E(X_{t+1}^2\mid X_t)$ are the roots
$$\lambda_1(\alpha_1, \alpha_2) = 1 - M^{-1}(\alpha_1+\alpha_2), \qquad \lambda_2(\alpha_1, \alpha_2) = 1 - 2M^{-1}(\alpha_1+\alpha_2) - 2M^{-2}(1-\alpha_1-\alpha_2), \tag{1.5}$$
and concludes that since their values agree for $\alpha_1 = \alpha_2 = 0$ with Hannan’s results, his conjecture will be true at least for small $\alpha_1$, $\alpha_2$.


† Presently at the Australian National University; this work was partly done while the author was
visiting Stanford University on an NSF grant. 







We show in this paper that for $\alpha_1 \ne 0$, $\alpha_2 \ne 0$ the latent roots $\lambda_j(\alpha_1, \alpha_2)$ must all be real and distinct, so that in fact Moran’s conjecture is exact. We also prove that if, as for the present model, we can express $E(X_{t+1}^k\mid X_t)$ as
$$E(X_{t+1}^k\mid X_t) = a_{0k} + a_{1k}X_t + \ldots + a_{kk}X_t^k \quad (k = 0, 1, \ldots, M), \tag{1.6}$$
where the $a_{jk} = a_{jk}(\alpha_1, \alpha_2)$, then it follows that the $a_{kk}$ are the required latent roots of the matrix.


2. PROPERTIES OF THE ROOTS OF $P(\alpha_1, \alpha_2)$

It is well known that the stochastic matrix P for a positively regular Markov chain with M + 1 states has latent roots
$$\lambda_0 = 1, \quad 1 > |\lambda_1| \ge |\lambda_2| \ge \ldots \ge |\lambda_M|, \tag{2.1}$$
where the $\lambda_j$ (j = 2, ..., M) may be real or complex.

We shall prove that if, for a positively regular Markov chain, P is the continuant matrix of transition probabilities
$$P = \begin{pmatrix} p_{00} & p_{01} & 0 & \cdots & 0 & 0\\ p_{10} & p_{11} & p_{12} & \cdots & 0 & 0\\ \vdots & \ddots & \ddots & \ddots & & \vdots\\ 0 & 0 & \cdots & p_{M-1,M-2} & p_{M-1,M-1} & p_{M-1,M}\\ 0 & 0 & \cdots & 0 & p_{M,M-1} & p_{MM} \end{pmatrix} \tag{2.2}$$
with non-zero off-diagonal elements $p_{j,j-1}$ (j = 1, 2, ..., M), $p_{j,j+1}$ (j = 0, 1, ..., M − 1), then the roots of the characteristic equation
$$\Delta_{M+1}(\lambda) = |\lambda I - P| = 0 \tag{2.3}$$
are real and distinct, so that
$$\lambda_0 = 1 > \lambda_1 > \lambda_2 > \ldots > \lambda_M > -1. \tag{2.4}$$


As the proof is similar to that given by Ledermann & Reuter (1954) for the roots of the matrix associated with a birth and death process, it will be only briefly outlined here.

Let $\Delta_n(\lambda) = |\lambda I_n - P_n|$ (n = 1, 2, ..., M) be the nth leading minor of the determinant $\Delta_{M+1}(\lambda)$ starting from the top. Then, defining $\Delta_0(\lambda) = 1$, we have that
$$\Delta_1(\lambda) = \lambda - p_{00}, \qquad \Delta_2(\lambda) = (\lambda - p_{00})(\lambda - p_{11}) - p_{10}p_{01}, \tag{2.5}$$
and so on, where $\Delta_n(\lambda)$ will clearly be a polynomial of degree n in $\lambda$. It may readily be shown by expanding the minor $\Delta_n(\lambda)$ (n = 2, 3, ..., M) that
$$\Delta_n(\lambda) = (\lambda - p_{n-1,n-1})\,\Delta_{n-1}(\lambda) - p_{n-1,n-2}\,p_{n-2,n-1}\,\Delta_{n-2}(\lambda), \tag{2.6}$$
where $p_{n-1,n-2}\,p_{n-2,n-1} > 0$.
We illustrate the general argument by applying it to $\Delta_2(\lambda)$; from (2.6) we may write
$$\Delta_2(\lambda) = (\lambda - p_{11})\,\Delta_1(\lambda) - p_{10}p_{01}\Delta_0. \tag{2.7}$$
Now $\Delta_1(\lambda)$ has one real root $p_{00}$, and when $\lambda = p_{00}$ we have that
$$\Delta_2(p_{00}) = -p_{10}p_{01}\Delta_0 < 0,$$
since $\Delta_0 > 0$. As $\Delta_2(\lambda)$ is a quadratic in $\lambda$, and $\Delta_2(\pm\infty) = \infty$, it follows that $\Delta_2(\lambda)$ has two real roots separated by $p_{00}$, the root of $\Delta_1(\lambda)$.
In general, if the polynomials $\Delta_{2r-1}(\lambda)$, $\Delta_{2r}(\lambda)$ (r = 1, 2, ..., [½M]) exhibit the sign patterns in (2.8), where the real roots of $\Delta_{2r-1}(\lambda)$ separate those of $\Delta_{2r}(\lambda)$, then from (2.6) it follows that $\Delta_{2r+1}(\lambda)$ will also have the sign pattern given.

$\Delta_{2r-1}(\lambda)$:   − 0 + 0 − ... + 0 − 0 +
$\Delta_{2r}(\lambda)$:     + 0 − 0 + 0 − ... − 0 + 0 − 0 +          (2.8)
$\Delta_{2r+1}(\lambda)$:   − 0 + 0 − 0 + ... − 0 + 0 − 0 +

For, at the zeros of $\Delta_{2r}(\lambda)$, $\Delta_{2r+1}(\lambda)$ will always have a sign opposite to that of $\Delta_{2r-1}(\lambda)$, and since $\Delta_{2r+1}(\infty) = \infty$, $\Delta_{2r+1}(-\infty) = -\infty$, there must be two further zeros in the two end regions where $\Delta_{2r}(\lambda) > 0$. A similar argument may now be extended to $\Delta_{2r+2}(\lambda)$. By induction it follows that $\Delta_{M+1}(\lambda)$ will have M + 1 real and distinct roots. We know further that $\lambda_0 = 1$ and $|\lambda_j| < 1$ (j = 1, ..., M), so that the statement (2.4) is proved.
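The three-term recurrence (2.6) is also a convenient way to evaluate the leading minors numerically. The sketch below is illustrative: the function name `leading_minors` and the 4-state tridiagonal stochastic matrix are hypothetical choices made here, not taken from the paper. It uses exact rational arithmetic to confirm that the full determinant vanishes at λ = 1 and that Δ₂ is negative at the root p₀₀ of Δ₁, as in the argument above.

```python
from fractions import Fraction

def leading_minors(P, lam):
    """Delta_0, ..., Delta_n of |lam*I - P| for a tridiagonal matrix P, via recurrence (2.6)."""
    deltas = [Fraction(1), lam - P[0][0]]
    for n in range(2, len(P) + 1):
        deltas.append((lam - P[n - 1][n - 1]) * deltas[-1]
                      - P[n - 1][n - 2] * P[n - 2][n - 1] * deltas[-2])
    return deltas

F = Fraction
# Hypothetical continuant stochastic matrix with non-zero off-diagonal elements.
P = [[F(1, 2), F(1, 2), F(0), F(0)],
     [F(1, 4), F(1, 2), F(1, 4), F(0)],
     [F(0), F(1, 3), F(1, 3), F(1, 3)],
     [F(0), F(0), F(1, 2), F(1, 2)]]

at_one = leading_minors(P, F(1))      # lam = 1 is a latent root of any stochastic matrix
at_p00 = leading_minors(P, P[0][0])   # lam = p_00 is the root of Delta_1
print(at_one[-1], at_p00[2])
```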

In Moran’s model, the probabilities $p_{jk}$ are of the form (1.1), with off-diagonal elements of the stochastic matrix $P(\alpha_1, \alpha_2)$ being non-zero when $\alpha_1 \ne 0$, $\alpha_2 \ne 0$; hence the roots $\lambda_j(\alpha_1, \alpha_2)$ (j = 0, 1, ..., M) of this continuant matrix will be real and distinct.

When $\alpha_1 = \alpha_2 = 0$, $p_{01} = p_{M,M-1} = 0$, and the roots of the matrix P(0, 0) as obtained by Hannan are
$$\lambda_0(0, 0) = \lambda_1(0, 0) = 1, \qquad \lambda_j(0, 0) = 1 - \frac{j(j-1)}{M^2} \quad (j = 2, \ldots, M). \tag{2.9}$$


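Using his expectation technique, Moran finds two roots
$$a_{11}(\alpha_1, \alpha_2) = 1 - M^{-1}(\alpha_1+\alpha_2), \qquad a_{22}(\alpha_1, \alpha_2) = 1 - 2M^{-1}(\alpha_1+\alpha_2) - 2M^{-2}(1-\alpha_1-\alpha_2), \tag{2.10}$$
of the matrix $P(\alpha_1, \alpha_2)$, and conjectures that these are respectively $\lambda_1(\alpha_1, \alpha_2)$ and $\lambda_2(\alpha_1, \alpha_2)$, at least for small $\alpha_1$, $\alpha_2$.
Since the roots of $P(\alpha_1, \alpha_2)$ are real, distinct and continuous in $\alpha_1$, $\alpha_2$ and since
$$a_{11}(0, 0) = 1 = \lambda_1(0, 0), \qquad a_{22}(0, 0) = 1 - 2M^{-2} = \lambda_2(0, 0) \tag{2.11}$$
it will follow that for all values of $\alpha_1, \alpha_2 > 0$
$$a_{11}(\alpha_1, \alpha_2) = \lambda_1(\alpha_1, \alpha_2), \qquad a_{22}(\alpha_1, \alpha_2) = \lambda_2(\alpha_1, \alpha_2). \tag{2.12}$$

Thus Moran’s conjecture is exact.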


3. THE EXPECTATION TECHNIQUE 


Moran’s expectation technique depends on the fact that, for the particular continuant matrix $P(\alpha_1, \alpha_2)$ with elements (1.1), it is possible (cf. equations (1.4)) to express $E(X_{t+1}\mid X_t)$ and $E(X_{t+1}^2\mid X_t)$ respectively as linear and quadratic functions of $X_t$
$$E(X_{t+1}\mid X_t) = a_{01} + a_{11}X_t, \qquad E(X_{t+1}^2\mid X_t) = a_{02} + a_{12}X_t + a_{22}X_t^2. \tag{3.1}$$
Here, the $a_{jk} = a_{jk}(\alpha_1, \alpha_2)$ are functions of $\alpha_1$, $\alpha_2$ and $X_t = jM^{-1}$ (j = 0, 1, ..., M) is the random variable giving the proportion of gametes a in a population of size M after t transitions of the system.

We note from (1.1) and (1.2) that the transition probabilities are themselves quadratic functions of $X_t$
$$p_{X_t,\,X_t-M^{-1}} = (1-\alpha_2)X_t - (1-\alpha_1-\alpha_2)X_t^2 \quad (X_t = M^{-1}, 2M^{-1}, \ldots, 1),$$
$$p_{X_t,\,X_t} = (1-\alpha_2) + (\alpha_1+3\alpha_2-2)X_t + 2(1-\alpha_1-\alpha_2)X_t^2 \quad (X_t = 0, M^{-1}, \ldots, 1),$$
$$p_{X_t,\,X_t+M^{-1}} = \alpha_2 + (1-\alpha_1-2\alpha_2)X_t - (1-\alpha_1-\alpha_2)X_t^2 \quad (X_t = 0, M^{-1}, \ldots, (M-1)M^{-1}). \tag{3.2}$$


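Their values are such that
$$E(X_{t+1}-X_t\mid X_t) = M^{-1}(p_{X_t,\,X_t+M^{-1}} - p_{X_t,\,X_t-M^{-1}}) = M^{-1}[\alpha_2 - (\alpha_1+\alpha_2)X_t],$$
$$E\{(X_{t+1}-X_t)^2\mid X_t\} = M^{-2}(p_{X_t,\,X_t+M^{-1}} + p_{X_t,\,X_t-M^{-1}}) = M^{-2}[\alpha_2 + (2-\alpha_1-3\alpha_2)X_t - 2(1-\alpha_1-\alpha_2)X_t^2]. \tag{3.3}$$
From these, one may derive equations (1.4) which are clearly of the form (3.1).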
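Since $X_{t+1} - X_t$ can only take the values $-M^{-1}$, 0, $M^{-1}$, then for all integral $r \ge 2$,
$$E\{(X_{t+1}-X_t)^{2r-1}\mid X_t\} = M^{-(2r-1)}[\alpha_2 - (\alpha_1+\alpha_2)X_t],$$
$$E\{(X_{t+1}-X_t)^{2r}\mid X_t\} = M^{-2r}[\alpha_2 + (2-\alpha_1-3\alpha_2)X_t - 2(1-\alpha_1-\alpha_2)X_t^2].$$
Hence
$$E(X_{t+1}^{2r-1}\mid X_t) = \sum_{j=1}^{2r-1}\binom{2r-1}{j}(-1)^{j-1}X_t^{j}\,E(X_{t+1}^{2r-1-j}\mid X_t) + M^{-(2r-1)}[\alpha_2 - (\alpha_1+\alpha_2)X_t],$$
$$E(X_{t+1}^{2r}\mid X_t) = \sum_{j=1}^{2r}\binom{2r}{j}(-1)^{j-1}X_t^{j}\,E(X_{t+1}^{2r-j}\mid X_t) + M^{-2r}[\alpha_2 + (2-\alpha_1-3\alpha_2)X_t - 2(1-\alpha_1-\alpha_2)X_t^2]. \tag{3.4}$$
Thus it is readily seen that for k = 0, 1, 2, ..., M, we obtain
$$E(X_{t+1}^{k}\mid X_t) = a_{0k} + a_{1k}X_t + \ldots + a_{kk}X_t^{k}, \tag{3.5}$$
where the $a_{jk} = a_{jk}(\alpha_1, \alpha_2)$.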
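These relations may be expressed in the matrix form
$$P(\alpha_1, \alpha_2)\,X = XA, \tag{3.6}$$
where the matrices
$$X = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0\\ 1 & M^{-1} & M^{-2} & \cdots & M^{-M}\\ 1 & 2M^{-1} & (2M^{-1})^2 & \cdots & (2M^{-1})^M\\ \vdots & \vdots & \vdots & & \vdots\\ 1 & (M-1)M^{-1} & \{(M-1)M^{-1}\}^2 & \cdots & \{(M-1)M^{-1}\}^M\\ 1 & 1 & 1 & \cdots & 1 \end{pmatrix}, \qquad A = \begin{pmatrix} a_{00} & a_{01} & a_{02} & \cdots & a_{0M}\\ 0 & a_{11} & a_{12} & \cdots & a_{1M}\\ \vdots & & \ddots & & \vdots\\ 0 & 0 & 0 & \cdots & a_{MM} \end{pmatrix}$$
have elements $X_{jk} = (jM^{-1})^k$, $A_{jk} = a_{jk}(\alpha_1, \alpha_2)$ (j, k = 0, 1, ..., M), respectively, A being upper triangular with $a_{00} = 1$. The matrix X clearly consists of M + 1 linearly independent vectors, and will be non-singular.

We may therefore write
$$P(\alpha_1, \alpha_2) = XAX^{-1} \tag{3.7}$$
and it follows directly that the latent roots $\lambda_k(\alpha_1, \alpha_2)$ will also be those of A, namely the diagonal elements $a_{kk}(\alpha_1, \alpha_2)$.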

4. EXPLICIT VALUES FOR THE ROOTS OF $P(\alpha_1, \alpha_2)$

We now use the expectation technique to obtain all roots of $P(\alpha_1, \alpha_2)$. From (3.4), we see that for k = 3, 4, ..., M, the coefficient of $X_t^k$ in $E(X_{t+1}^k\mid X_t)$ will be
$$a_{kk} = \sum_{j=1}^{k}\binom{k}{j}(-1)^{j-1}a_{k-j,\,k-j} \tag{4.1}$$
and it is readily proved by induction that
$$a_{kk}(\alpha_1, \alpha_2) = 1 - kM^{-1}(\alpha_1+\alpha_2) - k(k-1)M^{-2}(1-\alpha_1-\alpha_2). \tag{4.2}$$
For if the diagonal elements $a_{jj}$ of A are of the form (4.2) for j = 0, 1, 2, ..., k − 1 < M then
$$a_{kk} = \sum_{j=1}^{k}\binom{k}{j}(-1)^{j-1}\{1 - (k-j)M^{-1}(\alpha_1+\alpha_2) - (k-j)(k-j-1)M^{-2}(1-\alpha_1-\alpha_2)\}. \tag{4.3}$$
This is readily reducible to (4.2) by using the identity
$$(x-1)^k = x^k + \sum_{j=1}^{k}\binom{k}{j}(-1)^{j}x^{k-j}$$
and its first and second derivatives with respect to x, when x = 1.
From § 3, the $a_{kk}$ of (4.2) are in fact known to be the latent roots $\lambda_k(\alpha_1, \alpha_2)$ of $P(\alpha_1, \alpha_2)$. It should be noted that in this particular case $a_{k+1,k+1} < a_{kk}$ for all k = 0, 1, ..., M − 1 since
$$a_{kk} - a_{k+1,k+1} = M^{-1}(\alpha_1+\alpha_2) + 2kM^{-2}(1-\alpha_1-\alpha_2) = (\alpha_1+\alpha_2)M^{-1}(1-kM^{-1}) + kM^{-2}(2-\alpha_1-\alpha_2) > 0 \tag{4.4}$$
so that the $a_{kk}$ are also ordered, and so are precisely the roots
$$\lambda_k(\alpha_1, \alpha_2) = a_{kk}(\alpha_1, \alpha_2) \quad (k = 0, 1, \ldots, M).$$
For $\alpha_1 = \alpha_2 = 0$, these reduce to Hannan’s roots (2.9).
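The closed form (4.2) can be confirmed for any small case by exact arithmetic. The sketch below is an illustration, not part of the paper: the function names, the choice M = 5 and the mutation rates 1/10 and 1/5 are assumptions made here. It builds the transition matrix from the one-step probabilities of (1.1) and (1.2) and evaluates the characteristic determinant |λI − P| at each value given by (4.2).

```python
from fractions import Fraction

def moran_matrix(M, a1, a2):
    """Tridiagonal transition matrix built from p_j, q_j as in (1.1)-(1.2)."""
    P = [[Fraction(0)] * (M + 1) for _ in range(M + 1)]
    for j in range(M + 1):
        x = Fraction(j, M)
        p = x * (1 - a1) + (1 - x) * a2          # chance that the new gamete is of type a
        q = x * a1 + (1 - x) * (1 - a2)          # chance that the new gamete is of type A
        if j > 0:
            P[j][j - 1] = x * q
        if j < M:
            P[j][j + 1] = (1 - x) * p
        P[j][j] = x * p + (1 - x) * q
    return P

def determinant(A):
    """Exact determinant by Gaussian elimination over the rationals."""
    A = [row[:] for row in A]
    n, det = len(A), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if A[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            A[c], A[pivot] = A[pivot], A[c]
            det = -det
        det *= A[c][c]
        for r in range(c + 1, n):
            factor = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= factor * A[c][k]
    return det

def root(k, M, a1, a2):
    """Closed form (4.2) for the k-th latent root."""
    return 1 - Fraction(k, M) * (a1 + a2) - Fraction(k * (k - 1), M * M) * (1 - a1 - a2)

def char_poly_at(P, lam):
    n = len(P)
    return determinant([[(lam if i == j else 0) - P[i][j] for j in range(n)] for i in range(n)])

M, a1, a2 = 5, Fraction(1, 10), Fraction(1, 5)   # hypothetical parameter values
P = moran_matrix(M, a1, a2)
roots = [root(k, M, a1, a2) for k in range(M + 1)]
residuals = [char_poly_at(P, lam) for lam in roots]
print(roots)
print(residuals)   # each entry should vanish if (4.2) gives the latent roots
```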


Similar results for the stochastic matrices of Moran’s more complex models, in which the states of 
Markov chains are defined by two and four variates respectively, will be given in a further paper. 


REFERENCES 


MORAN, P. A. P. (1958). Random processes in genetics. Proc. Camb. Phil. Soc. 54, 60-71.
LEDERMANN, W. & REUTER, G. E. H. (1954). Spectral theory for the differential equations of simple birth and death processes. Phil. Trans. A, 246, 321-69.


Note added in proof: For a comprehensive analysis of Moran’s model in continuous time, the reader is referred to:


KARLIN, S. & MCGREGOR, J. (1960). On a genetics model of Moran. Stanford University Technical Report.


Departures from assumption in sequential analysis 


By W. J. EWENS 
Department of Statistics, University of Melbourne 


1. INTRODUCTION


In carrying out a sequential test, one is normally testing for the value of a parameter in a distribution, 
the density type involved being assumed known. This paper is concerned with the effect on such a test 
of wrongly specifying this density type. Particular reference is made to the case where the assumption 
of normality is made. We distinguish two cases; testing for the mean of a population, and testing for the variance. A single formula is derived, covering both cases, from which complete sets of power and
average sample size curves may be found. While the examples given cover only these two cases, the 
formula is valid for departures from any assumed distribution. 

As in the case of fixed-sample-size tests, we find that the sequential test for means is comparatively 
robust in respect to departure from assumed normality but the test for variances is very sensitive. 

2. SEQUENTIAL TESTING 
If we wish to test sequentially the hypothesis
$$H_0: f(x) = f(x, \theta_0),$$
against
$$H_1: f(x) = f(x, \theta_1),$$
with strength $(\alpha, \beta)$, we continue testing while
$$\beta/(1-\alpha) < \prod_{i=1}^{n}\{f(x_i, \theta_1)/f(x_i, \theta_0)\} < (1-\beta)/\alpha, \tag{1}$$
accepting $H_0$ ($H_1$) if the lower (upper) inequality is broken. Equation (1) is more conveniently written in terms of logarithms
$$\ln\{\beta/(1-\alpha)\} < \sum_{i=1}^{n} z_i < \ln\{(1-\beta)/\alpha\}, \tag{2}$$
where $z_i = \ln\{f(x_i, \theta_1)/f(x_i, \theta_0)\}$.
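Wald (1947) has shown that if the true value of the parameter is $\theta$, then P = Prob. (accepting $H_0$) is given by
$$P = \frac{\{(1-\beta)/\alpha\}^h - 1}{\{(1-\beta)/\alpha\}^h - \{\beta/(1-\alpha)\}^h}, \tag{3}$$
where h is the unique non-zero solution for t in
$$\int_{-\infty}^{+\infty}\{f(x, \theta_1)/f(x, \theta_0)\}^t f(x, \theta)\,dx = 1. \tag{4}$$
Such a value h can be shown to exist under fairly general conditions. The average sample number (A.S.N.) when the parameter takes the value $\theta$ is
$$\text{A.S.N.} = \frac{P\ln\{\beta/(1-\alpha)\} + (1-P)\ln\{(1-\beta)/\alpha\}}{E(z\mid\theta)}, \tag{5}$$
where P is found from (3) and (4). In the case $E(z\mid\theta) = 0$,
$$P = \frac{\ln\{(1-\beta)/\alpha\}}{\ln\{(1-\beta)/\alpha\} - \ln\{\beta/(1-\alpha)\}}, \qquad \text{A.S.N.} = \frac{-\ln\{(1-\beta)/\alpha\}\,\ln\{\beta/(1-\alpha)\}}{E(z^2\mid\theta)}.$$
It may be shown that if the null and alternative hypotheses are
$$H_0: x \sim N(\theta_0, \sigma^2), \qquad H_1: x \sim N(\theta_1, \sigma^2),$$
then
$$h = (\theta_0+\theta_1-2\theta)/(\theta_1-\theta_0), \qquad E(z\mid\theta) = (\theta_1-\theta_0)(2\theta-\theta_0-\theta_1)/(2\sigma^2).$$
Similarly, if null and alternative hypotheses are
$$H_0: x \sim N(0, \sigma_0^2), \qquad H_1: x \sim N(0, \sigma_1^2),$$
equation (4) reduces to
$$(\sigma_0^2/\sigma_1^2)^t = 1 - t\sigma^2(1/\sigma_0^2 - 1/\sigma_1^2),$$
where $\sigma^2$ is the true variance value. The A.S.N. is given by (5) if we put
$$E(z\mid\sigma^2) = \tfrac12\sigma^2(1/\sigma_0^2 - 1/\sigma_1^2) - \ln\{\sigma_1/\sigma_0\}.$$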


3. DEPARTURES FROM ASSUMED DISTRIBUTIONS 


We now aim to generalize the above formulae to the case where the density type is wrongly specified. In doing so we will use neither Wald’s fundamental identity nor his artificial density functions, but simple difference equations.







We suppose that the null and alternative hypotheses are
$$H_0: f(x) = f(x, \theta_0), \qquad H_1: f(x) = f(x, \theta_1)$$
and that the test is of strength $(\alpha, \beta)$. Suppose that x has the density $g(x, \theta)$, which may or may not be of the same type as f. Then $z = \ln\{f(x, \theta_1)/f(x, \theta_0)\}$ will have density $\phi(z)$ given by
$$g(x)\,dx = \phi(z)\,dz, \qquad z = z(x).$$
The test is naturally carried out in terms of $\sum z_i$, since we assume the densities f in null and alternative hypothesis. We let p(y) denote the probability of accepting $H_0$ when the cumulative sum $\sum z_i$ is y, and n(y) the expected further sample size at this point. Then if we make the usual assumptions about negligible boundary overlap, we have
$$p(y) = \int_{-\infty}^{+\infty} p(y+z)\,\phi(z)\,dz, \tag{5}$$
$$n(y) = \int_{-\infty}^{+\infty} n(y+z)\,\phi(z)\,dz + 1. \tag{6}$$
Putting $p(y) = \exp(ty)$, we have
$$1 = \int_{-\infty}^{+\infty}\exp(tz)\,\phi(z)\,dz. \tag{7}$$

It may be shown that under similar conditions to those of Wald (1947), (7) has a unique non-zero solution for t. If we denote this solution by h, we may find the power curve by applying to the general solution $p(y) = A + B\exp(hy)$ the boundary conditions
$$p[\ln\{\beta/(1-\alpha)\}] = 1, \qquad p[\ln\{(1-\beta)/\alpha\}] = 0.$$
These conditions fix the constants A and B, giving
$$P = \text{Prob. (accepting } H_0) = p(0) = \frac{\exp[h\ln\{(1-\beta)/\alpha\}] - 1}{\exp[h\ln\{(1-\beta)/\alpha\}] - \exp[h\ln\{\beta/(1-\alpha)\}]} = \frac{\{(1-\beta)/\alpha\}^h - 1}{\{(1-\beta)/\alpha\}^h - \{\beta/(1-\alpha)\}^h}. \tag{8}$$
Since (7) may be rewritten
$$1 = \int_{-\infty}^{+\infty}\{f(x, \theta_1)/f(x, \theta_0)\}^t g(x, \theta)\,dx \tag{9}$$
we find that (8) and (9) are sufficient to determine the power curve. We need only put any value of $\theta$ required in (9), solve for t, and substitute in (8). This generalizes the usual power curve formula. The A.S.N. curve is found as follows. Equation (6) is non-homogeneous, with particular solution
$$n(y) = -y/E(z).$$
General solutions are therefore
$$n(y) = -y/E(z) + C + D\exp(hy),$$
where C and D are arbitrary constants, and h is given by (7) or (9). C and D are fixed by the boundary conditions
$$n[\ln\{\beta/(1-\alpha)\}] = n[\ln\{(1-\beta)/\alpha\}] = 0,$$
giving
$$E(n) = n(0) = \frac{P\ln\{\beta/(1-\alpha)\} + (1-P)\ln\{(1-\beta)/\alpha\}}{E(z)}, \tag{10}$$
where P is found from (8) and (9). Thus the usual formula for the A.S.N. continues to hold if P and E(z) are interpreted correctly.

The most interesting applications of the above formulae are made when both null and alternative 
hypotheses assume a normal distribution of the observations, since in this case we may compare the 
results with well-known fixed-sample-size test results. We consider both tests for means and tests for 
variances. 


(a) Tests for means 


(i) We first suppose that the true density of the variate is the double-exponential density
$$g(x) = \exp[-\sqrt{2}\,|x-\mu|/\sigma]/(\sqrt{2}\,\sigma) \quad (-\infty < x < +\infty),$$
for which $E(x) = \mu$, var(x) = $\sigma^2$, $\gamma_1 = 0$, $\gamma_2 = 3$. Suppose also that null and alternative hypotheses are
$$H_0: x \sim N(-\tfrac12, \sigma^2), \qquad H_1: x \sim N(+\tfrac12, \sigma^2).$$
After some reduction (9) becomes
$$\exp(\mu t/\sigma^2) = 1 - t^2/(2\sigma^2). \tag{11}$$
On the other hand, if x were normally distributed with mean $\mu$ and variance $\sigma^2$, equation (9) becomes
$$-2\mu = t. \tag{12}$$
Let us fix $\sigma$ at any arbitrary value. Then for any value of $\mu$, we may solve both (11) and (12) for t, and hence find two values of P. By varying $\mu$ we generate two distinct power curves. Finally, we may change $\sigma^2$ and generate a further set of two curves. The results are tabled below for $\alpha = \beta = 0.05$. It is clear from (11) and (12) that $\mu = -\mu'$ gives $t = -t'$ and hence $P = 1 - P'$. Therefore only half of each curve need be given.

Table 1. Probability of rejecting the null hypothesis $\{x \sim N(-\tfrac12, \sigma^2)\}$ in favour of the alternative $\{x \sim N(+\tfrac12, \sigma^2)\}$, the population sampled being (a) double-exponential (D.E.), (b) normal, with indicated mean and variance. $\alpha = \beta = 0.05$.

        $\sigma^2 = 4$                      $\sigma^2 = 1$
Mean     D.E.     Normal        Mean     D.E.     Normal
0.662    0.972    0.980         1.061    0.972    0.998
0.594    0.962    0.971         0.844    0.962    0.993
0.534    0.950    0.959         0.693    0.950    0.983
0.474    0.934    0.942         0.577    0.934    0.968
0.417    0.913    0.921         0.482    0.913    0.945
0.358    0.887    0.892         0.402    0.887    0.914
0.307    0.854    0.859         0.331    0.854    0.875
0.254    0.813    0.817         0.267    0.813    0.828
0.202    0.765    0.767         0.209    0.765    0.773
0.151    0.708    0.708         0.154    0.708    0.712
0.100    0.643    0.643         0.101    0.643    0.644
0.050    0.573    0.573         0.050    0.573    0.573
0.000    0.500    0.500         0.000    0.500    0.500
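The way entries of Table 1 arise can be sketched as follows, assuming α = β = 0.05 and the hypotheses N(∓½, σ²) used above. The function names and the bisection solver are assumptions made for this illustration, not the author's computing method; equation (11) is solved for its non-zero root (for positive mean only), (12) gives t directly in the normal case, and both are substituted in (8).

```python
import math

A = (1 - 0.05) / 0.05      # (1 - beta)/alpha
B = 0.05 / (1 - 0.05)      # beta/(1 - alpha)

def reject_h0(h):
    """1 - P, with P from equation (8); h must be non-zero."""
    accept = (A ** h - 1.0) / (A ** h - B ** h)
    return 1.0 - accept

def h_double_exponential(mu, var):
    """Non-zero root of exp(mu*t/var) = 1 - t**2/(2*var), equation (11), for mu > 0."""
    f = lambda t: math.exp(mu * t / var) - 1.0 + t * t / (2.0 * var)
    lo, hi = -math.sqrt(2.0 * var), -1e-9     # f(lo) > 0 > f(hi) when mu > 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def h_normal(mu):
    """Equation (12)."""
    return -2.0 * mu

for mu, var in ((0.101, 1.0), (0.662, 4.0)):
    print(mu, var, reject_h0(h_double_exponential(mu, var)), reject_h0(h_normal(mu)))
```

The two printed pairs can be compared with the rows of Table 1 for mean 0.101 (variance 1) and mean 0.662 (variance 4).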


(ii) We now suppose that the true density of the observations is of the rectangular form
$$g(x) = 1/(2\sqrt{3}\,\sigma) \quad (\mu - \sqrt{3}\,\sigma < x < \mu + \sqrt{3}\,\sigma).$$
Let $H_0$, $H_1$, $\alpha$ and $\beta$ be as in the previous example. Then (9) becomes
$$\exp(\mu t/\sigma^2) = \sqrt{3}\,t/\{\sigma\sinh(\sqrt{3}\,t/\sigma)\}. \tag{13}$$
As before, we first fix $\sigma$, and then solve (13) for various $\mu$. It is also evident from (13) that the symmetry conditions of the previous example also apply here.

Table 2. Probability of rejecting the null hypothesis $\{x \sim N(-\tfrac12, 1)\}$ in favour of the alternative $\{x \sim N(+\tfrac12, 1)\}$, the population sampled being (a) rectangular, (b) normal, with variance 1 and indicated mean. $\alpha = \beta = 0.05$.

Rectangular    Normal
0.972          0.958
0.962          0.949
0.950          0.936
0.934          0.921
0.913          0.901
0.887          0.878
0.854          0.846
0.813          0.808
0.765          0.761
0.708          0.706
0.643          0.642
0.573          0.573
0.500          0.500




We observe that if we base the sequential scheme on the usual assumption of normality, the probability levels for both populations are not far from those which hold in the case where the distribution of the observations is, in fact, normal.† Since one of the populations considered has high positive kurtosis (+3) and the other negative kurtosis (−1.2) we may infer that population kurtosis does not markedly affect the test. This result agrees with the corresponding conclusion for fixed-sample-size tests (cf. Pearson, 1931; Geary, 1947; Gayen, 1950; Box & Andersen, 1955).


(b) Tests for variances
For tests on variances, we have
$$H_0: x \sim N(0, \sigma_0^2), \qquad H_1: x \sim N(0, \sigma_1^2). \tag{14}$$
We have shown that if $x \sim N(0, \sigma^2)$ and the test is of strength $(\alpha, \beta)$, then P is given by equation (8), where h is the non-zero solution for t in
$$(\sigma_0^2/\sigma_1^2)^t = 1 - t\sigma^2(1/\sigma_0^2 - 1/\sigma_1^2). \tag{15}$$
We can now use (9) to extend this to the case of non-normal observations.
(i) Let us consider the double-χ density
$$g(x) = \left(\frac{\delta}{2\sigma^2}\right)^{\delta/2}\frac{1}{\Gamma(\tfrac12\delta)}\,|x|^{\delta-1}\exp(-\tfrac12\delta x^2/\sigma^2), \tag{16}$$
$$E(x) = 0, \qquad E(x^2) = \sigma^2, \qquad \delta^{-1} = 1 + \tfrac12\gamma_2.$$

Table 3. Probability of rejecting the null hypothesis, for various variance values, the variate being from a double-χ density with the indicated value of kurtosis. $\alpha = \beta = 0.05$, $\sigma_0^2 = 1$, $\sigma_1^2 = 4$.

                        $\gamma_2$
$\sigma^2$     −1         0 (normal)    1        2
4.752     0.99915    0.972         0.913    0.854
4.000     0.9972     0.950         0.877    0.813
3.386     0.991      0.913         0.828    0.765
2.883     0.972      0.854         0.765    0.708
2.476     0.913      0.765         0.687    0.643
2.130     0.765      0.643         0.597    0.573
1.848     0.500      0.500         0.500    0.500
1.616     0.235      0.357         0.403    0.427
1.421     0.087      0.235         0.313    0.357
1.255     0.028      0.146         0.235    0.292
1.117     0.0089     0.087         0.172    0.235
1.000     0.00276    0.050         0.123    0.187
0.905     0.00085    0.028         0.087    0.146


If we have null and alternative hypotheses as specified by (14), and use (16) for g(x) in (9), then after a little reduction the following equation for t is found
$$(\sigma_0^2/\sigma_1^2)^{t/\delta} = 1 - (1/\sigma_0^2 - 1/\sigma_1^2)\,t\sigma^2/\delta \tag{17}$$
which is clearly a generalization of (15). Equations (3) and (17) may be rewritten
$$P = \frac{\{(1-\beta)/\alpha\}^{\delta h} - 1}{\{(1-\beta)/\alpha\}^{\delta h} - \{\beta/(1-\alpha)\}^{\delta h}}, \tag{18}$$
where h is the non-zero solution of (15) for t. Since $\delta = (1+\tfrac12\gamma_2)^{-1}$, we find a remarkable analogy between this result and the corresponding one for fixed-sample-size tests. Here Box (1953) has shown that the test statistic $M_1$ (used in the test for heterogeneity of variance) computed from a double-χ distribution is distributed as $(1+\tfrac12\gamma_2)M_1$ computed from a normal population. One may also conjecture that (18) is ‘asymptotically’ true for any distribution with finite cumulants. (‘Asymptotic’ here meaning for $(\sigma_1^2-\sigma_0^2)$ small.) Equation (18) may now be used to compute power curves for various values of $\gamma_2$. The results are given in Table 3. Evidently non-zero values of kurtosis upset the probability levels to a marked degree. This result agrees with the corresponding behaviour of the test statistic for fixed-sample-size tests (cf. Box (1953). In this paper the sequential test power curve is evaluated at the points where $H_0$ and $H_1$ are true, but by a different method than above.)

† Agreement is not so good in the double exponential case when $\sigma^2 = 1$ as when $\sigma^2 = 4$, but the discrepancy is far less than that obtained below when testing for variance.
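Entries of Table 3 can be reproduced along the following lines. This is a sketch under the stated setting (α = β = 0.05, variances 1 and 4 under the two hypotheses): the function names and the bisection bracketing are assumptions made here, and the solver presumes the alternative variance exceeds the null variance and that the true variance is not far below the tabulated range. Equation (15) is solved for h, which is then scaled by δ = 1/(1 + ½γ₂) as in (18).

```python
import math

ALPHA = BETA = 0.05
LOG_A = math.log((1 - BETA) / ALPHA)
LOG_B = math.log(BETA / (1 - ALPHA))

def solve_h(var, var0=1.0, var1=4.0):
    """Non-zero root t of (var0/var1)**t = 1 - t*var*(1/var0 - 1/var1), equation (15)."""
    c = 1.0 / var0 - 1.0 / var1                 # assumes var1 > var0, so c > 0
    log_r = math.log(var0 / var1)
    f = lambda t: t * log_r - math.log1p(-t * var * c)
    slope0 = log_r + var * c                    # f'(0): its sign shows on which side the root lies
    if slope0 == 0.0:
        return 0.0
    if slope0 < 0:                              # root between 0 and the pole at 1/(var*c)
        lo, hi = 1e-9, (1.0 - 1e-12) / (var * c)
    else:                                       # root is negative
        lo, hi = -1.0, -1e-9
        while f(lo) <= 0.0:
            lo *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (f(mid) < 0.0) == (slope0 < 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def reject_h0(var, gamma2):
    """1 - P with P from (18), for a double-chi variate with kurtosis gamma2."""
    delta = 1.0 / (1.0 + 0.5 * gamma2)
    h = delta * solve_h(var)
    if abs(h) < 1e-12:                          # limiting form of (8) as h tends to 0
        return 1.0 - LOG_A / (LOG_A - LOG_B)
    accept = (math.exp(h * LOG_A) - 1.0) / (math.exp(h * LOG_A) - math.exp(h * LOG_B))
    return 1.0 - accept

for var in (4.0, 2.130, 1.0):
    print(var, [round(reject_h0(var, g2), 4) for g2 in (-1.0, 0.0, 1.0, 2.0)])
```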


(c) Average sample size 

We may use the above power curve tables, in conjunction with equation (10), to find the A.S.N. curves when normality is wrongly assumed in sequential tests.

Since tests on means show small divergence in power curves from the nominal, the A.S.N. curves will also show only small divergence. However, there will naturally be large divergences for the test on variances, since in this case the power curves differ markedly. We use Table 3 and equation (10) to derive Table 4 below, which gives the A.S.N. curves for the distributions involved in Table 3. An interesting result is that for the case $\gamma_2 = 2$, the maximum A.S.N. occurs outside the range $(\sigma_0^2, \sigma_1^2)$. A joint examination of Tables 3 and 4 shows the extent to which ‘normal theory’ has now broken down.


Table 4. Average sample sizes corresponding to the distributions of Table 3

                     $\gamma_2$
$\sigma^2$     −1       0 (normal)    1       2
4.752      2.70     2.55          2.23    1.91
4.000      3.63     3.28          2.75    2.28
3.386      5.01     4.22          3.35    2.71
2.883      7.16     5.37          4.02    3.16
2.476     10.33     6.63          4.68    3.58
2.130     14.76     7.96          5.40    4.07
1.848     18.06     9.03          6.02    4.51
1.615     17.86     9.64          6.54    4.92
1.421     15.19     9.75          6.88    5.26
1.255     12.50     9.37          7.02    5.51
1.117     10.55     8.87          7.05    5.69
1.000      9.21     8.33          6.98    5.81
0.905      8.89     7.86          6.88    5.90
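As an illustration of how (10) leads to entries of this kind, the sketch below evaluates the average sample number at the two points where (15) has the exact solutions h = 1 (true variance 1) and h = −1 (true variance 4). The function names are assumptions made here; E(z) is taken from the expression ½σ²(1/σ₀² − 1/σ₁²) − ln(σ₁/σ₀) given earlier, which depends on the sampled population only through its variance.

```python
import math

ALPHA = BETA = 0.05
LOG_A = math.log((1 - BETA) / ALPHA)     # upper boundary ln{(1 - beta)/alpha}
LOG_B = math.log(BETA / (1 - ALPHA))     # lower boundary ln{beta/(1 - alpha)}

def accept_h0(h):
    """P from equation (8) for a non-zero exponent h."""
    return (math.exp(h * LOG_A) - 1.0) / (math.exp(h * LOG_A) - math.exp(h * LOG_B))

def mean_z(var, var0=1.0, var1=4.0):
    """E(z) for the variance test when the true variance is var."""
    return 0.5 * var * (1.0 / var0 - 1.0 / var1) - 0.5 * math.log(var1 / var0)

def asn(var, gamma2, h_normal):
    """Equation (10), with P from (18); h_normal is the solution of (15) at this variance."""
    delta = 1.0 / (1.0 + 0.5 * gamma2)
    P = accept_h0(delta * h_normal)
    return (P * LOG_B + (1.0 - P) * LOG_A) / mean_z(var)

for gamma2 in (-1.0, 0.0, 1.0, 2.0):
    print(gamma2, round(asn(4.0, gamma2, -1.0), 2), round(asn(1.0, gamma2, 1.0), 2))
```

The printed values can be compared with the rows of Table 4 for variance 4.000 and 1.000; small differences in the last figure are to be expected from rounding.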


5. SUMMARY 


Comparison has been made between the effects of non-normality in a fixed-sample-size test and in a sequential test. Although different methods of investigation are necessary, the results obtained are similar. In both tests there is a comparative insensitivity to non-normality for tests on means, and a sensitivity to non-zero values of kurtosis for tests on variances. The tests have a similar asymptotic property. A formula for the A.S.N. curve for the sequential test has been derived.


REFERENCES 


BOX, G. E. P. (1953). Non-normality and tests on variances. Biometrika, 40, 318-35.

BOX, G. E. P. & ANDERSEN, S. L. (1955). Permutation theory in the derivation of robust criteria and the study of departures from assumption. J. R. Statist. Soc. B, 17, 1-26.

GAYEN, A. K. (1950). Significance of the difference between the means of two non-normal samples. Biometrika, 37, 399-408.

GEARY, R. C. (1947). Testing for normality. Biometrika, 34, 209-42.

PEARSON, E. S. (1931). The analysis of variance in cases of non-normal variation. Biometrika, 23, 114-33.

WALD, A. (1947). Sequential Analysis. New York: John Wiley and Sons, Inc.







A note on some asymptotic properties of the logarithmic series distribution 


By J. C. GOWER 
Rothamsted Experimental Station 


SUMMARY 

Formulae are given to calculate
$$S(x, R) = \sum_{r=R+1}^{\infty}\frac{x^r}{r},$$
the remainder term in the logarithmic series, which are valid for high values of R and values of x close 
to unity. These are of value when dealing with extreme values of the parameters appearing in the 
logarithmic series distribution such as have arisen in C. B. Williams’s recent study of the possible dis- 
tribution of the number per species for all the insects in the world. Bounds are given for the errors 
involved when making the recommended approximations. The calculation of the median of the dis- 
tribution is also discussed. 


INTRODUCTION 


The Logarithmic Series Distribution was introduced in an article by Fisher, Corbett & Williams (1943) 
who fitted it to the numbers of individuals per species in samples of Macro-Lepidoptera caught in a 
light trap at Rothamsted. Since then a number of papers, notably those of C. B. Williams, have used 
this distribution to describe data of diverse kinds. Recently Williams (1960) has investigated the pro- 
blem of assessing the possible distribution of the number per species of all the insects in the world. 
Attention is focused on the number of species with more than a specified number of individuals. When this 
specified number is very large and the logarithmic distribution is used, a knowledge of its asymptotic 
properties is needed. Williams (1960) remarks that ‘this is a complex problem which requires further 
investigation’. In this article some asymptotic properties of the logarithmic distribution will be listed 
and some indication is given of the range of values of the parameters of this distribution for which the 
results are reasonably accurate. Biologists should note these restrictions, since if the formulae are 
applied outside the range for which they are designed, inaccurate results may be obtained. 

Using the standard notation, suppose we have N individuals distributed in S different species. 
Then Fisher showed on quite plausible assumptions (Fisher et al. 1943) that the expected number of 
species with i individuals would be proportional to $x^i/i$, where x is a parameter, having a value between 
0 and 1, which depends on N and S. If the factor of proportionality is $\alpha$ then the expected numbers with 
1, 2, 3, etc., individuals per species are $\alpha x$, $\alpha x^2/2$, $\alpha x^3/3$, .... These are the individual terms in the series 
expansion of $-\alpha\log(1-x)$ and it is for this reason that the series is known as the logarithmic series. 

The total number of individuals is $\alpha x + \alpha x^2 + \alpha x^3 + \dots = \alpha x/(1-x)$. To estimate $\alpha$ and x we thus have 
two equations 

$$S = -\alpha\log(1-x), \qquad N = \frac{\alpha x}{1-x}. \tag{1}$$


It has often been found that, using formulae (1) to estimate $\alpha$ and x for a set of data, the logarithmic 
series gives a good fit to the observations. Williams (1960) postulates that it might not be unreasonable 
to use this series to fit to the number of insects and insect species in the world, although a better fit might 
be obtained by using a truncated lognormal distribution. As rough estimates he takes $N = 10^{18}$ and 
$S = 3 \times 10^{6}$ and with these values calculates $x = 1 - 1.002 \times 10^{-13}$ and $\alpha = 1.002 \times 10^{5}$. He then asks 
how many species there are with more than R individuals. This quantity is 

$$\alpha S(x, R) = \sum_{r=R+1}^{\infty} \frac{\alpha x^r}{r}.$$


The remainder of this article is concerned with the numerical evaluation of S(x, R), particularly for 
large values of R. 
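As a hedged illustration of how equations (1) might be solved numerically, the sketch below works with d = 1 - x instead of x, since x lies so close to unity that digits would otherwise be lost. The function name `fit_log_series` and the use of bisection on log d are assumptions made for this example and are not taken from the article.

```python
import math

def fit_log_series(N, S):
    """Solve equations (1) for (alpha, d), where d = 1 - x.

    Dividing the two equations gives S/N = d * (-log d) / (1 - d),
    which increases steadily with d on (0, 1), so bisection applies.
    """
    target = S / N

    def ratio(d):
        return d * (-math.log(d)) / (1.0 - d)

    lo, hi = math.log(1e-300), math.log(1.0 - 1e-12)  # bracket for log d
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ratio(math.exp(mid)) < target:
            lo = mid
        else:
            hi = mid
    d = math.exp(0.5 * (lo + hi))
    alpha = N * d / (1.0 - d)          # second of equations (1)
    return alpha, d

# Williams's rough estimates quoted in the text
alpha, d = fit_log_series(N=1e18, S=3e6)
print(alpha, d)
```

With the rough estimates quoted above, the solution should lie close to the rounded values given in the text for $\alpha$ and 1 - x.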


EVALUATION OF THE REMAINDER S(x, R) 


An exact formula for the remainder term, true for all values of x and R, is 

$$S(x, R) = -\log(1-x) - \sum_{n=1}^{R}\frac{1}{n} - \int_0^{1-x}\frac{(1-\phi)^R - 1}{\phi}\,d\phi. \tag{2}$$










To evaluate this formula for large values of R we deal first with the term $\sum_{n=1}^{R} 1/n$. This has been tabulated 
by Glover (1930) for R = 2(1)450. If these tables are not available or if larger values of R are of interest 
recourse can be had to the well-known formula 

$$\sum_{n=1}^{R}\frac{1}{n} = \gamma + \frac{1}{2R} + \log R - U_R, \tag{3}$$

where $\gamma = 0.5772156649$ is Euler's constant and 

$$U_R = \frac{B_1}{2R^2} - \frac{B_2}{4R^4} + \frac{B_3}{6R^6} - \dots,$$

the $B_i$ being Bernoulli numbers. The series $U_R$ rapidly approaches zero as R increases, specimen values 
being given in Table 1. 


Table 1 

       R        U_R

      10    0.00083 25039
      50    0.00003 33320
     100    0.00000 83333
     200    0.00000 20833
     500    0.00000 03333
   1,000    0.00000 00833
   2,000    0.00000 00208
   5,000    0.00000 00033
  10,000    0.00000 00008

It is thus permissible to replace $\sum_{n=1}^{R} 1/n$ by 

$$\gamma + \frac{1}{2R} + \log R. \tag{4}$$

If R > 100 the result is at most one figure out in the fifth decimal place. 
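The following sketch compares formulae (3) and (4) with direct summation. It assumes the older convention for the Bernoulli numbers, $B_1 = 1/6$, $B_2 = 1/30$, $B_3 = 1/42$, which reproduces the entries of Table 1; only three terms of $U_R$ are kept, and the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649  # value quoted in the text

def u_series(R):
    """First three terms of U_R, assuming B1 = 1/6, B2 = 1/30, B3 = 1/42."""
    return 1.0 / (12 * R**2) - 1.0 / (120 * R**4) + 1.0 / (252 * R**6)

def harmonic_formula3(R):
    """Formula (3): gamma + 1/(2R) + log R - U_R."""
    return EULER_GAMMA + 1.0 / (2 * R) + math.log(R) - u_series(R)

def harmonic_formula4(R):
    """Formula (4): formula (3) with U_R dropped."""
    return EULER_GAMMA + 1.0 / (2 * R) + math.log(R)

def harmonic_direct(R):
    """Direct summation of 1/n for n = 1, ..., R."""
    return sum(1.0 / n for n in range(1, R + 1))

for R in (10, 100, 1000):
    print(R, harmonic_direct(R), harmonic_formula3(R), harmonic_formula4(R))
```

The difference between the last two columns is $U_R$ itself, which for R = 100 is about 0.0000083, in line with the statement about the fifth decimal place.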
Returning to (2) we now try to find an expression for the integral term more amenable to calculation. 
A well-known inequality is 

$$e^{-t} > (1 - t/n)^n > e^{-t} - t^2/(2n) \quad \text{if } 0 < t < n.$$

Writing n = R and $t = \phi R$, subtracting one, dividing each expression by $\phi$, integrating between 0 and 
(1 - x) and noting that 

$$\int_0^{1-x}\frac{e^{-R\phi} - 1}{\phi}\,d\phi = \mathrm{Ei}[-R(1-x)] - \log[R(1-x)] - \gamma, \tag{5}$$

we find that 

$$\mathrm{Ei}[-R(1-x)] - \log[R(1-x)] - \gamma > \int_0^{1-x}\frac{(1-\phi)^R - 1}{\phi}\,d\phi > \mathrm{Ei}[-R(1-x)] - \log[R(1-x)] - \gamma - \tfrac{1}{4}R(1-x)^2.$$

Here -Ei(-u) is the exponential integral and has been tabulated by several authors, the most extensive 
tabulation being that of the New York W.P.A. (1940) where u = 0(0.001)10(0.1)15. Results for some 
higher values of u are given by Akahiro (1929) where u = 20(0.02)50. We may thus write approximately 

$$\int_0^{1-x}\frac{(1-\phi)^R - 1}{\phi}\,d\phi = \mathrm{Ei}[-R(1-x)] - \log[R(1-x)] - \gamma, \tag{6}$$

the error being less than $\tfrac{1}{4}R(1-x)^2$. Combining (2), (3) and (6) 

$$S(x, R) = -\mathrm{Ei}[-R(1-x)] - \frac{1}{2R} + U_R + V, \tag{7}$$

where V is an error term which is less than $\tfrac{1}{4}R(1-x)^2$. A further simplification can be made when R(1 - x) 
is small. In this case 

$$\int_0^{1-x}\frac{(1-\phi)^R - 1}{\phi}\,d\phi = -R(1-x) \tag{8}$$

with an error less than $\tfrac{1}{4}R^2(1-x)^2$. Thus we have 

$$S(x, R) = -\log(1-x) - \sum_{n=1}^{R}\frac{1}{n} + R(1-x) + W. \tag{9}$$







An alternative approach, due to P. M. Grundy (unpublished work), is to express $\int_R^{\infty} (x^u/u)\,du$ in terms of 
its Euler-Maclaurin expansion (Jeffreys & Jeffreys, 1950, pp. 278-83). This gives 

$$S(x, R-1) = -\mathrm{Ei}[R\log x] + \frac{x^R}{2R} \tag{10}$$

with an error less than $(1 - R\log x)\,x^R/(12R^2)$. At first sight this seems a neater result than those obtained 
above, and in fact for many values of x and R it is. Unfortunately, for very large values of R and x near 
to unity, $x^R$ must for ease of computation be replaced by $e^{-R(1-x)}$, and an investigation of the error 
involved in this approximation would invoke inequalities such as the one used previously. In fact, making 
this approximation, (10) can be shown to be equivalent to (7). 


THE USE OF THE FORMULAE 


The formulae appear at first sight to be rather complex, but in fact little difficulty should be found 
in using them. A summary of the situations when the different formulae should be used is given below. 


(1) $R^2(1-x)^2$ is small. Use formula (9). The error is less than $\tfrac{1}{4}R^2(1-x)^2$. If tables of $\sum_{n=1}^{R} 1/n$ are not 
available, or if R is too large, formula (3) may be used to evaluate $\sum_{n=1}^{R} 1/n$. 

(2) $R(1-x)^2$ is small. If R(1 - x) is appreciable but $R(1-x)^2$ is small use formula (7). The error is 
less than $\tfrac{1}{4}R(1-x)^2$. 

(3) If x is not too small use Grundy's formula (10). 
These formulae should be sufficient for most calculations. 
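The rules above can be sketched in code as follows. This is an illustration under stated assumptions rather than the procedure used to prepare Table 2: the argument `d` stands for 1 - x, the switch between formulae (9) and (7) at R(1 - x) = 0.001 is an arbitrary choice, only the leading term of $U_R$ is kept in formula (7), and the exponential integral -Ei(-z) is evaluated by a standard power series and continued fraction instead of by the tables cited.

```python
import math

EULER_GAMMA = 0.5772156649015329

def harmonic(R):
    """Sum of 1/n for n = 1..R: direct for small R, formula (3) otherwise."""
    if R <= 100:
        return sum(1.0 / n for n in range(1, int(R) + 1))
    u = 1.0 / (12 * R**2) - 1.0 / (120 * R**4) + 1.0 / (252 * R**6)
    return EULER_GAMMA + 1.0 / (2 * R) + math.log(R) - u

def exp_integral_e1(z):
    """-Ei(-z) for z > 0: power series if z <= 1, continued fraction otherwise."""
    if z <= 1.0:
        total, term = 0.0, 1.0
        for k in range(1, 60):
            term *= -z / k              # term = (-z)^k / k!
            total -= term / k
        return -EULER_GAMMA - math.log(z) + total
    b = z + 1.0                         # modified Lentz evaluation
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, 500):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(-z)

def remainder(d, R):
    """S(x, R) with d = 1 - x, by formula (9) or formula (7)."""
    z = R * d
    if z < 1e-3:                        # R^2 (1 - x)^2 negligible: formula (9)
        return -math.log(d) - harmonic(R) + z
    u = 1.0 / (12 * R**2)               # leading term of U_R
    return exp_integral_e1(z) - 1.0 / (2 * R) + u   # formula (7)

d = 1.002e-13
for power in (1, 6, 13):
    print(10**power, round(remainder(d, 10.0**power), 3))
```

Formula (7) is only appropriate here because d is so small that R is very large whenever R(1 - x) is appreciable; for other parameter values the conditions listed above should be checked first.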


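Table 2. Calculation of the remainder term S(x, R) for $3 \times 10^{6}$ (= S) species and $x = 1 - 1.002 \times 10^{-13}$ 

R         S(x, R)   100{1 + S(x, R)/log(1 - x)}   Notes                                                      Method

1         28.932      3.3                         -                                                          Direct summation
10        27.003      9.8                         Formulae (9) and (2) (with the term 1/(2R)) give 27.002    Direct summation
10^2      24.744     17.3                         Direct summation gives the same results                    Formulae (9) and (2), including the term 1/(2R)
10^3      22.446     25.0                         Direct summation gives the same results                    Formulae (9) and (2), including the term 1/(2R)
10^4      20.144     32.7                         -                                                          Formulae (9) and (2) (the term 1/(2R) is negligible)
10^5      17.842     40.4                         -                                                          Formulae (9) and (2) (the term 1/(2R) is negligible)
10^6      15.539     48.1                         -                                                          Formulae (9) and (2) (the term 1/(2R) is negligible)
10^7      13.236     55.8                         -                                                          Formulae (9) and (2) (the term 1/(2R) is negligible)
10^8      10.934     63.5                         -                                                          Formulae (9) and (2) (the term 1/(2R) is negligible)
10^9       8.631     71.2                         -                                                          Formulae (9) and (2) (the term 1/(2R) is negligible)
10^10      6.330     78.9                         -                                                          Formulae (9) and (2) (the term 1/(2R) is negligible)
10^11      4.038     86.5                         These values agree with those obtained by formulae (2) and (9)   Formula (7)
10^12      1.823     93.9                         These values agree with those obtained by formulae (2) and (9)   Formula (7)
10^13      0.219     99.3                         -                                                          Formula (7)
10^14      0.000    100.0                         -                                                          Formula (7)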





As an example Williams (1960) suggested values of $N = 10^{18}$ and $S = 3 \times 10^{6}$ for the number of insects 
and insect species in the world and calculated $x = 1 - 1.002 \times 10^{-13}$, $\alpha = 1.002 \times 10^{5}$. With these figures 
and the formulae derived above Table 2 is produced. This agrees substantially with Williams's Table 4 
(see the columns for $S = 3 \times 10^{6}$) and extends his table for the higher values of R. If the other two columns 
of Williams's Table 4 are extended (i.e. $S = 2 \times 10^{6}$ and $S = 5 \times 10^{6}$), similar sets of results are obtained 
and the main conclusion, that the percentage number of species with more than a specified number of 
individuals per species remains fairly steady over the range of values of S considered, remains true for 
the higher values of R. 
















THE MEDIAN 
The median of the logarithmic series is defined as the value of R satisfying the equation 

$$S(x, R) = -\tfrac{1}{2}\log(1-x). \tag{11}$$

If R is large (> 100) and R(1 - x) is small, then (9) becomes $S(x, R) = -\log(1-x) - \log R - \gamma$ so that 
the median is given by $-\tfrac{1}{2}\log(1-x) = \log R + \gamma$, i.e. 

$$R = \frac{e^{-\gamma}}{\sqrt{(1-x)}} = \frac{0.56146}{\sqrt{(1-x)}}. \tag{12}$$

This, apart from a term 0.81524, is the formula given on page 144 of Williams (1960) and attributed to 
P. M. Grundy. The 0.81524 given there should occur as an additive term and not a multiplicative one 
as printed. 

A further stage of approximation adds in the Grundy terms $(\tfrac{1}{2} + e^{-2\gamma})$, but for high values of S this is 
of course quite trivial. In many ways the best approach appears to be to produce a table similar to 
Table 2 above and then interpolate to find the value of R corresponding to 50 %. For Table 2 this 
would occur between $R = 10^{6}$ and $R = 10^{7}$, where the curve of S(x, R) is linear relative to a log-scale 
of R. Noting that $-\log(1-x) = 29.932$, linear interpolation gives 

$$\frac{15.539 - \tfrac{1}{2}(29.932)}{15.539 - 13.236} = \log_{10} R - 6,$$

so that $\log_{10} R = 6.2488$, R = 1773000, the value obtained by formula (12). If the median occurs on the curved 
portion of the S(x, R) curve then some more elaborate form of inverse interpolation using, say, four 
adjacent points on the S(x, R) curve, will have to be used. 
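Both routes to the median can be checked with a few lines of code. The sketch below is illustrative only; the function names are assumptions, and the tabulated values are those of Table 2.

```python
import math

EULER_GAMMA = 0.5772156649

def median_formula12(d):
    """Formula (12), with d = 1 - x."""
    return math.exp(-EULER_GAMMA) / math.sqrt(d)

def median_by_interpolation(total, lower, upper):
    """Linear interpolation on a log10 scale of R.

    total        : -log(1 - x)
    lower, upper : (log10 R, S(x, R)) pairs bracketing total / 2
    """
    (logr0, s0), (logr1, s1) = lower, upper
    frac = (s0 - 0.5 * total) / (s0 - s1)
    return 10 ** (logr0 + frac * (logr1 - logr0))

m12 = median_formula12(1.002e-13)
m_interp = median_by_interpolation(29.932, (6, 15.539), (7, 13.236))
print(round(m12), round(m_interp))
```

The two results should agree to about three significant figures, the small difference arising from the rounding of the tabulated values.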


REFERENCES 


Fisher, R. A., Corbet, A. S. & Williams, C. B. (1943). The relation between the number of species 
and the number of individuals in a random sample of an animal population. J. Anim. Ecol. 
12, 42-58. 

Williams, C. B. (1960). The range and pattern of insect abundance. Amer. Nat. 94, 137-51. 

Glover, J. W. (1930). Tables of Applied Mathematics in Finance, Insurance and Statistics. Ann 
Arbor, Michigan: Wahr. 

New York W.P.A. (1940). Tables of Sine, Cosine and Exponential Integrals, 1 and 2. 

Akahiro, T. (1929). Sci. Pap. Inst. Phys. Chem. Res. Tokyo. Table no. 3, pp. 181-215. 

Jeffreys, H. & Jeffreys, B. S. (1950). Methods of Mathematical Physics (second edition). Cambridge University 
Press. 


On a property of balanced designs 


By M. ATIQULLAH 
Birkbeck College, London 


1. INTRODUCTION AND SUMMARY 


Special classes of balanced designs are well known and widely used. In this paper, a necessary and 
sufficient condition for a general class of connected designs to be balanced is derived. This is a natural 
extension of a result of Tocher (1952) and of Thompson (1956), and it appears to be simpler than the 
generalization given by Rao (1958). A simple expression for calculating the efficiency factor for a con- 
nected balanced design is obtained. 

Fisher (1940) established that b ≥ v for a balanced incomplete block design with v treatments and 
b blocks. This inequality is shown to be true for a wider class of designs, similar to the balanced incom- 
plete block designs but with blocks of different sizes. 


2. NOTATION AND PRELIMINARY RESULTS 
Consider v treatments arranged in b blocks in a design whose incidence matrix is $N = (n_{ij})$, where 
$n_{ij}$ denotes the number of experimental units in the ith block getting the jth treatment. The ith block 
is of size $k_i$ (i = 1, 2, ..., b) and the jth treatment is replicated $r_j$ times (j = 1, 2, ..., v). These may be 
conveniently written as the diagonal matrices $K = \mathrm{diag}(k_1, \dots, k_b)$, $R = \mathrm{diag}(r_1, \dots, r_v)$. The total yield 
of the jth treatment is $T_j$ and that of the ith block is $B_i$. On writing $T' = (T_1, \dots, T_v)$ and $B' = (B_1, \dots, B_b)$ 
in matrix notation, the prime denoting transposition, the adjusted normal equations for estimating the 
vector of treatment constants t can be written under the usual assumptions as 

$$Ct = Q, \tag{1}$$

where $Q = T - N'K^{-1}B$, $C = R - N'K^{-1}N$. 

The dispersion matrix of Q is known to be $\sigma^2 C$, where $\sigma^2$ is the intra-block error. Since each row (or 
column) of C adds up to zero, the rank of C is at most v - 1, and $e' = (v^{-1/2}, v^{-1/2}, \dots, v^{-1/2})$ is the latent 
vector corresponding to the zero root. If the rank of C is v - 1, the design is said to be connected (Bose, 
1950), and in that case the solution of (1) is given by 

$$\hat{\tau} = (\Gamma_1 u^{-1} \Gamma_1')Q. \tag{2}$$

Here $\Gamma = (e : \Gamma_1)$ is an orthogonal matrix transforming C into a diagonal matrix, $u = \mathrm{diag}(u_1, u_2, \dots, u_{v-1})$ 
being the diagonal matrix whose elements are the non-zero latent roots of C, and $\tau_j = t_j - \bar{t}$ $(\bar{t} = (1/v)\sum_j t_j)$. The dispersion matrix 
of $\hat{\tau}$ follows from (2) as $\sigma^2 D$, where 

$$D = \Gamma_1 u^{-1} \Gamma_1', \tag{3}$$

and each row (or column) of D adds up to zero. 


3. CONDITION FOR A DESIGN TO BE BALANCED 


A design is said to be balanced if every elementary contrast $\tau_j - \tau_{j'}$ is estimated with the same variance. 
We have the following theorem. 


THEOREM 1. A necessary and sufficient condition for a connected design to be balanced is that every 
treatment constant is estimated with the same variance and every pair of treatment constants with the 
same covariance. 

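To prove the necessary condition, we can write, following Kempthorne (1956), the average variance 
for elementary contrasts, $\tau_j - \tau_{j'}$, as 

$$\frac{2}{v(v-1)}\sum_{j<j'}\mathrm{var}(\hat{\tau}_j - \hat{\tau}_{j'}) = \frac{2\sigma^2}{H}, \tag{4}$$

where H is the harmonic mean of the non-zero latent roots of C. 
Since the trace of the matrix D is equal to the sum of its latent roots, we get 

$$\sum_j \mathrm{var}(\hat{\tau}_j) = \frac{(v-1)\sigma^2}{H}. \tag{5}$$

Also, 

$$\mathrm{var}(\hat{\tau}_1 - \hat{\tau}_1) + \mathrm{var}(\hat{\tau}_1 - \hat{\tau}_2) + \dots + \mathrm{var}(\hat{\tau}_1 - \hat{\tau}_v) = v\,\mathrm{var}(\hat{\tau}_1) + \sum_j \mathrm{var}(\hat{\tau}_j), \tag{6}$$

$$\mathrm{var}(\hat{\tau}_2 - \hat{\tau}_1) + \mathrm{var}(\hat{\tau}_2 - \hat{\tau}_2) + \dots + \mathrm{var}(\hat{\tau}_2 - \hat{\tau}_v) = v\,\mathrm{var}(\hat{\tau}_2) + \sum_j \mathrm{var}(\hat{\tau}_j), \tag{7}$$

for the rows of D add up to zero. 

Clearly, $\mathrm{var}(\hat{\tau}_1) = \mathrm{var}(\hat{\tau}_2)$ if the design is balanced (i.e. if every elementary contrast, $\tau_j - \tau_{j'}$, is 
estimated with the same variance). 

Hence, it follows from (5) that 

$$\mathrm{var}(\hat{\tau}_j) = \frac{\sigma^2}{H}\left(1 - \frac{1}{v}\right), \tag{8}$$

and the relation (4) gives 

$$\mathrm{var}(\hat{\tau}_j - \hat{\tau}_{j'}) = \frac{2\sigma^2}{H}, \tag{9}$$

or 

$$2\,\mathrm{var}(\hat{\tau}_j) - 2\,\mathrm{cov}(\hat{\tau}_j, \hat{\tau}_{j'}) = \frac{2\sigma^2}{H}, \tag{10}$$

or 

$$\mathrm{cov}(\hat{\tau}_j, \hat{\tau}_{j'}) = -\frac{\sigma^2}{vH}. \tag{11}$$

For the proof of sufficiency, note that if $\mathrm{var}(\hat{\tau}_j)$ and $\mathrm{cov}(\hat{\tau}_j, \hat{\tau}_{j'})$ are independent of j and j', so obviously 
is $\mathrm{var}(\hat{\tau}_j) + \mathrm{var}(\hat{\tau}_{j'}) - 2\,\mathrm{cov}(\hat{\tau}_j, \hat{\tau}_{j'})$. 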













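4. EFFICIENCY FACTOR FOR A CONNECTED BALANCED DESIGN 


If a connected design is balanced, the non-zero latent roots of C are identical, $u_1 = u_2 = \dots = u_{v-1} = H$, 
and we have from (2) and (3) that 

$$\hat{\tau} = \frac{1}{H}(I - ee')Q = \frac{1}{H}Q, \tag{12}$$

$$D = \frac{1}{H}(I - ee'). \tag{13}$$

Hence 

$$\mathrm{cov}(Q, Q') = H^2\,\mathrm{cov}(\hat{\tau}, \hat{\tau}'), \tag{14}$$

or 

$$C = H^2 D, \tag{15}$$

or 

$$R - N'K^{-1}N = H^2 D = H\begin{pmatrix} 1 - \frac{1}{v} & -\frac{1}{v} & \dots & -\frac{1}{v} \\ -\frac{1}{v} & 1 - \frac{1}{v} & \dots & -\frac{1}{v} \\ \vdots & \vdots & & \vdots \\ -\frac{1}{v} & -\frac{1}{v} & \dots & 1 - \frac{1}{v} \end{pmatrix}. \tag{16}$$

Since trace $(N'K^{-1}N)$ = trace $(R - H^2 D)$, we have 

$$\sum_i \sum_j \frac{n_{ij}^2}{k_i} = \sum_j r_j - H(v-1), \tag{17}$$

or 

$$H = \frac{\sum_j r_j - \sum_i \sum_j n_{ij}^2/k_i}{v-1}. \tag{18}$$

Hence, the efficiency factor (E.F.) relative to a randomized block with the same number of experimental 
units n $(= \sum_j r_j = \sum_i k_i)$ is given by 

$$\mathrm{E.F.} = \frac{H}{n/v} = \frac{n - \sum_i \sum_j n_{ij}^2/k_i}{n\{1 - (1/v)\}}. \tag{19}$$

We call a design binary if no treatment occurs more than once in the same block, i.e. if $n_{ij}$ = 0 or 1. Then 
$n_{ij}^2 = n_{ij}$ and for a connected binary balanced design we have 

$$H = \frac{n - b}{v - 1}, \tag{20}$$

$$\mathrm{E.F.} = \frac{n - b}{n\{1 - (1/v)\}}. \tag{21}$$

Further, if the design be equi-replicate, 

$$\mathrm{E.F.} = \frac{vr - b}{r(v - 1)}, \tag{22}$$

where $r_1 = r_2 = \dots = r_v = r$. 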
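As a numerical check of (16), (18) and (19), the sketch below builds $C = R - N'K^{-1}N$ for a small balanced incomplete block design using exact fractions. The design chosen (seven treatments in seven blocks of three, each pair of treatments meeting once) and the function names are assumptions introduced for the example only.

```python
from fractions import Fraction

def c_matrix(blocks, v):
    """C = R - N'K^{-1}N for a design given as a list of blocks of treatment indices."""
    C = [[Fraction(0)] * v for _ in range(v)]
    for block in blocks:
        k = len(block)
        counts = [block.count(j) for j in range(v)]   # row of N for this block
        for j in range(v):
            C[j][j] += counts[j]                      # accumulates r_j
            for jp in range(v):
                C[j][jp] -= Fraction(counts[j] * counts[jp], k)
    return C

def harmonic_root_and_efficiency(blocks, v):
    """H from equation (18) and the efficiency factor from equation (19)."""
    n = sum(len(b) for b in blocks)
    s = sum(Fraction(b.count(j) ** 2, len(b)) for b in blocks for j in range(v))
    H = (n - s) / (v - 1)
    return H, H * v / n

# Illustrative design: v = b = 7, r = k = 3
design = [[0, 1, 2], [0, 3, 4], [0, 5, 6], [1, 3, 5],
          [1, 4, 6], [2, 3, 6], [2, 4, 5]]
C = c_matrix(design, 7)
H, ef = harmonic_root_and_efficiency(design, 7)
print(H, ef, C[0][0], C[0][1])
```

For this design H = 7/3, so that by (16) the diagonal elements of C are H(1 - 1/v) = 2 and the off-diagonal elements are -H/v = -1/3, while formula (22) gives (21 - 7)/(3 x 6) = 7/9.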
5. EXTENSION OF FISHER’S INEQUALITY 
THEOREM 2. For a connected binary equi-replicate balanced design, b cannot be less than v. 
On substituting the value of H = (vr - b)/(v - 1) in (16) we get 

$$N'K^{-1}N = \begin{pmatrix} \frac{b}{v} & \frac{vr-b}{v(v-1)} & \dots & \frac{vr-b}{v(v-1)} \\ \frac{vr-b}{v(v-1)} & \frac{b}{v} & \dots & \frac{vr-b}{v(v-1)} \\ \vdots & \vdots & & \vdots \\ \frac{vr-b}{v(v-1)} & \frac{vr-b}{v(v-1)} & \dots & \frac{b}{v} \end{pmatrix}. \tag{23}$$

Hence, the determinant 

$$|N'K^{-1}N| = \left(\frac{b-r}{v-1}\right)^{v-1} r. \tag{24}$$

Suppose v > b. Then $|N'K^{-1}N| = 0$ and consequently b = r, which contradicts the fact that the design 
is binary. Hence v ≤ b. 

It is interesting to note that for v = b and consequently $r = (k_1 + k_2 + \dots + k_v)/v$, we have the 
determinantal value from (24) as 

$$|N'N| = (k_1 k_2 \dots k_v)\left(\frac{v-r}{v-1}\right)^{v-1} r \tag{25}$$

$$\le r^{v+1}\left(\frac{v-r}{v-1}\right)^{v-1}. \tag{26}$$

Since the left-hand side of (25) is an integral square, so also must be the right-hand side. Hence (25) 
can be used as a test for the possible existence of a connected binary equi-replicate design having 
specified values of v = b, $k_1, \dots, k_v$. 
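The test based on (25) is easily mechanized. The sketch below evaluates the right-hand side of (25) exactly and reports whether it is a perfect square; the function name and the two parameter sets are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import isqrt, prod

def determinant_test(v, ks):
    """Right-hand side of (25) for v = b and block sizes ks, and whether
    it is a non-negative integer that is a perfect square."""
    r = Fraction(sum(ks), v)                 # r = (k_1 + ... + k_v) / v
    value = prod(ks) * ((v - r) / (v - 1)) ** (v - 1) * r
    if value.denominator != 1 or value < 0:
        return value, False
    m = int(value)
    return value, isqrt(m) ** 2 == m

# v = b = 7 with all blocks of size 3: the value is 576 = 24^2
value_a, ok_a = determinant_test(7, [3] * 7)

# v = b = 22 with all blocks of size 7: the value is 49 * 5^21, not a square
value_b, ok_b = determinant_test(22, [7] * 22)

print(value_a, ok_a)
print(value_b, ok_b)
```

Passing the test is necessary but not sufficient: a parameter set that fails cannot correspond to a design of the kind considered, while one that passes may or may not.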


I am indebted to Dr D. R. Cox and the referee for suggesting a number of improvements to the 
original draft. 
REFERENCES 


Bose, R. C. (1950). Least Squares Aspects of Analysis of Variance. Institute of Statistics, University 
of North Carolina. 

Fisher, R. A. (1940). An examination of the different possible solutions of a problem in incomplete 
blocks. Ann. Eugen., Lond., 10, 52-75. 

Kempthorne, O. (1956). The efficiency factor of an incomplete block design. Ann. Math. Statist. 
27, 846-9. 

Rao, V. R. (1958). A note on balanced designs. Ann. Math. Statist. 29, 290-4. 

Roy, J. (1958). On the efficiency factor of block designs. Sankhyā, 19, 181-8. 

Thompson, W. A. (1956). A note on balanced incomplete block designs. Ann. Math. Statist. 27, 842-6. 

Tocher, K. D. (1952). The design and analysis of block experiments. J. R. Statist. Soc. B, 14, 45-91. 


Aliasing in partially confounded factorial experiments 


By M. J. R. HEALY AND J. C. GOWER 
Rothamsted Experimental Station 


In a factorial experiment with factors at p levels with p > 2, the confounding structure is usually 
determined by breaking down the interactions into groups of (p - 1) contrasts by means of systems 
of linear congruences (see, for example, Kempthorne, 1952). For example, to confound a $3^3$ experiment 
in blocks of 9 units each, we may allot each treatment to block 1, 2 or 3 according as 


(level of 1st factor) + (level of 2nd factor) + 2 x (level of 3rd factor) ≡ 0, 1 or 2 (mod 3). 


This procedure will confound a pair of degrees of freedom belonging to the 3-factor interaction which 
may be denoted by $ABC^2$. The other 6 d.f. from this interaction are similarly denoted in pairs by $AB^2C$, 
$AB^2C^2$ and ABC, respectively. This type of subdivision has no particular meaning in the interpretation 
of the results. 

Particularly when the factor levels are quantitative, a different kind of breakdown is often of interest. 
For example, each main effect can be broken down into (p—1) orthogonal polynomial contrasts, and 
these induce corresponding breakdowns of the interactions. When an experiment is confounded using 
one set of contrasts and analysed using another, certain difficulties arise which are closely analogous to 
the problems of aliasing in fractionally replicated experiments. This fact was originally pointed out by 
Yates (1937, §12c) but does not seem to have attracted much attention since that time. 

The least squares estimates of a set of treatment contrasts can readily be obtained. Consider an 
experiment with r replicates of t treatments carried out in b blocks of k units each. Its design can be 
specified by the incidence matrix N, of dimensions t x b. The rows of N correspond to the treatments and the 
columns to the blocks; $N_{ij} = 1$ if treatment i occurs in block j, otherwise $N_{ij} = 0$. The normal equations 
can be written 

$$\begin{pmatrix} rI & N \\ N^T & kI \end{pmatrix}\begin{pmatrix} t \\ b \end{pmatrix} = \begin{pmatrix} T \\ B \end{pmatrix},$$

where t and b are the vectors of treatment and block constants, and T and B the vectors of treatment 
and block totals. Eliminating b, we find that the normal equations for t can be written 

$$\left(rI - \frac{1}{k}NN^T\right)t = Q,$$

where Q = T - (1/k) NB is the vector of treatment totals of deviations from block means. 
Suppose next that an orthogonal set of treatment contrasts v is defined by 

$$v = Gt,$$

where G is an orthonormal matrix, the elements in the first row being all equal. Then the normal 
equations for v are easily shown to be 

$$\left(rI - \frac{1}{k}MM^T\right)v = GQ,$$

where M = GN. Difficulties in analysis and interpretation may arise when the matrix on the left-hand 
side of this equation is not diagonal. 

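As a simple example consider a single replicate $3^2$ experiment in blocks of 3 units, confounding $AB^2$. 
The incidence matrix, with the treatments taken in the same order as the columns of $G_0$ below, is 

$$N = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{pmatrix}.$$

To analyse this in terms of linear and quadratic components, we take G to be $CG_0$, where $G_0$ is equal to 

$$\begin{array}{l|rrrrrrrrr} I & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ A_L & -1 & -1 & -1 & 0 & 0 & 0 & 1 & 1 & 1 \\ A_Q & 1 & 1 & 1 & -2 & -2 & -2 & 1 & 1 & 1 \\ B_L & -1 & 0 & 1 & -1 & 0 & 1 & -1 & 0 & 1 \\ B_Q & 1 & -2 & 1 & 1 & -2 & 1 & 1 & -2 & 1 \\ A_L B_L & 1 & 0 & -1 & 0 & 0 & 0 & -1 & 0 & 1 \\ A_L B_Q & -1 & 2 & -1 & 0 & 0 & 0 & 1 & -2 & 1 \\ A_Q B_L & -1 & 0 & 1 & 2 & 0 & -2 & -1 & 0 & 1 \\ A_Q B_Q & 1 & -2 & 1 & -2 & 4 & -2 & 1 & -2 & 1 \end{array}$$

and C is a diagonal matrix that normalizes the rows. 
The information matrix for these contrasts is 

$$rI - \frac{1}{k}MM^T = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & 0 & 0 & -\frac{1}{2} \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & \frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & \frac{1}{2} & \frac{1}{2} & 0 \\ 0 & 0 & 0 & 0 & 0 & -\frac{1}{2} & 0 & 0 & \frac{1}{2} \end{pmatrix}.$$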








From this it will be seen that $A_L B_L$ is completely aliased with $A_Q B_Q$, and $A_L B_Q$ with $A_Q B_L$. These 
contrasts can be estimated only if some linear relationship between the members of the pair, with known 
coefficients, is assumed to hold. 
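The aliasing in this example can be verified numerically. The following sketch forms G, N and the information matrix $rI - (1/k)MM^T$ for the single replicate $3^2$ experiment; the labelling of the levels as 0, 1, 2, the allocation of treatment (a, b) to the block indexed by a + 2b (mod 3), and the sign conventions of the interaction contrasts (taken as products of the main-effect coefficients) are assumptions made for the illustration.

```python
from itertools import product
from math import sqrt

treatments = list(product(range(3), range(3)))   # (a, b), with b varying fastest
one, lin, quad = [1, 1, 1], [-1, 0, 1], [1, -2, 1]

def contrast(fa, fb):
    """Normalized contrast with coefficient fa[a] * fb[b] for treatment (a, b)."""
    row = [fa[a] * fb[b] for a, b in treatments]
    norm = sqrt(sum(c * c for c in row))
    return [c / norm for c in row]

names = ["I", "A_L", "A_Q", "B_L", "B_Q",
         "A_L B_L", "A_L B_Q", "A_Q B_L", "A_Q B_Q"]
pairs = [(one, one), (lin, one), (quad, one), (one, lin), (one, quad),
         (lin, lin), (lin, quad), (quad, lin), (quad, quad)]
G = [contrast(fa, fb) for fa, fb in pairs]

# Incidence matrix when AB^2 is confounded (assumed block labelling)
N = [[1 if (a + 2 * b) % 3 == j else 0 for j in range(3)] for a, b in treatments]
r, k = 1, 3

# M = GN and the information matrix rI - (1/k) M M^T
M = [[sum(G[i][t] * N[t][j] for t in range(9)) for j in range(3)] for i in range(9)]
info = [[(r if i == j else 0) - sum(M[i][c] * M[j][c] for c in range(3)) / k
         for j in range(9)] for i in range(9)]

# Print the 4 x 4 section belonging to the interaction contrasts
for name, row in zip(names[5:], info[5:]):
    print(name, [round(x, 3) + 0.0 for x in row[5:]])
```

The interaction section splits into two 2 x 2 blocks, each of rank one, which is the complete aliasing described above.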

If two replicates of the experiment are used, with AB confounded in one and $AB^2$ in the other, balance 
is attained and this type of difficulty does not arise. For, in terms of the contrasts used in the con- 
founding, each section of the information matrix that corresponds to interactions of a given order will 
be simply a multiple of the unit matrix, and this property is unchanged when a different breakdown of 
the interactions is imposed. 

Certain common designs will now be examined in terms of orthogonal polynomial contrasts. 










1. 3x3x3 IN BLOCKS OF 9 


There are four types of replicate, each confounding a pair of d.f. from the 3-factor interaction. If a 
single replicate is used (confounding, say, ABC), the eight polynomial-type contrasts fall into two sets 
of four, such that the contrasts within each set are aliased together. With two or three replicates, con- 
founding a different pair of d.f. in each, the information matrix is no longer singular, but it still contains 
off-diagonal terms so that explicit matrix inversion is necessary for complete solution of the normal 
equations. The resulting estimates are correlated, and the situation may be referred to as partial aliasing. 
With four replicates, balance is obtained and all the estimates can be found by simply dividing the 
appropriate Q contrast totals by 3, the relative information (3/4) times the number of replicates. 


2. 3x 2x2 IN BLOCKS OF 6 


Yates (1937) gives a balanced design in 3 replicates. Because the confounding is balanced, the poly- 
nomial contrasts can be obtained from the contrast totals of the Q’s by dividing by three times the relative 
information. 


3. 3x 3x2 IN BLOCKS OF 6 


Here balance requires four replicates. If only two replicates are used, partial aliasing occurs in both 
the AB and ABC interactions. In particular, $A_L B_L$ is partially aliased with $A_Q B_Q$ and $A_L B_Q$ with $A_Q B_L$. 


The occurrence of partial and total aliasing is of importance when considering the analysis of factorial 
experiments on electronic computers. Thus Tocher (discussion in Yates, Healy & Lipton, 1957), has 
suggested an extension of Yates’s ‘plus-and-minus’ technique to factors at more than two levels—in 
essence, the main effects are broken down into single degrees of freedom and the matrix G is then built 
up as a direct product of smaller matrices, providing a rapid and compact scheme for the formation of 
the vector GQ. However, it is still necessary to invert the information matrix, and when this is 
no longer diagonal a fully general program is likely to become cumbersome. 


REFERENCES 


Kempthorne, O. (1952). The Design and Analysis of Experiments. New York: John Wiley and 
Sons Inc. 

Yates, F. (1937). The design and analysis of factorial experiments. Tech. Commun. Emp. Bur. Soil 
Sci. 35. 

Yates, F., Healy, M. J. R. & Lipton, S. (1957). Routine analysis of replicated experiments on an 
electronic computer. J. R. Statist. Soc. B, 19, 234-64. 


Studies in the History of Probability and Statistics. XII. The Book of Fate 


By M. G. KENDALL 


Research Techniques Division, London School of Economics 


1. Divination by chance mechanism is a very ancient practice and is reported from many countries: 
China, Tibet, Greece, Rome and among the Germanic tribes known to the ancient world. Sticks and dice 
were both in use. This, for example, is Tacitus’s account of the Germani: 

‘To divination and the drawing of lots (auspices sortesque) they pay as much attention as anyone. 
The method of drawing lots is standardized. They take a bough from some nut tree and cut it into strips, 
on which certain runes are written. These strips are scattered at random on a white cloth. Then the priest 
(if it is a public occasion) or the head of the household (if it is a family matter) prays to the gods, turns 
his eyes to heaven and takes up three sticks, one at a time. The runes so collected are then interpreted.’ 


2. This is one of the earliest instances I know of divination in Europe by reference to a previously 
prepared script. The Romans also had their Sibylline Books. It was perhaps more common in Rome, 
and not uncommon among the Germanic tribes, to divine from natural phenomena such as the voices 
of birds or the entrails of a sacrifice. Mantic practices of all kinds flourished under the later Roman 
Empire until Constantine’s time, when the official adoption of Christianity put a stop to the investiga- 
tion of God’s will by lot. Occasional attempts at revival were discouraged. Three medieval prelates who 
















were each claiming possession of the bones of a saint decided, in what nowadays would probably be 
regarded as a very sporting spirit, to settle the dispute by drawing lots. But they were reprimanded for 
doing so and their agreement was cancelled by higher authority. 


3. Vestiges of the religious practices of one age provide the superstitions of the next and the pastimes 
of the one after that. The Gods were transformed into demons and then into fairies; their rites became 
customs; divination dwindled into fortune telling. But, although consulting the oracle may have become 
only an entertainment it was taken seriously enough to survive every effort to suppress it. There does not 
exist much documentary evidence of the practice before the advent of printing but occasional refer- 
ences are enough to show that it never disappeared. In an earlier article (1956) I have referred to some 
of the medieval poems like the ‘Chaunce of the Dyse’ which told fortunes or character by dice throwing. 
The so-called sortes Virgilianae seem also to have been practised over the period among those who were 
lucky enough to have books. A volume of Virgil was opened at random, a passage also chosen at random 
and then applied to the situation requiring elucidation. (Virgil in the dark ages was regarded as a 
notable necromancer.) Some comments of striking aptitude are reported but the only example I can quote 
is modern, probably apocryphal and illustrates the depths of degradation to which Sortilege has 
descended: when Margaret Bondfield, as Minister of Labour, was about to move a resolution requiring 
the renewal of unemployment assistance, known at the time as ‘the dole’ the sortes are said to have yielded 
the famous line from the Aeneid: 


‘Infandum, regina, jubes renovare dolorem.’ 


4. The advent of playing cards and of printing revived interest in this type of entertainment. In 
1540 Francesco Marcolini of Forli published Le Sorti, a large and elaborately illustrated volume for 
consulting the oracle by using a pack of playing cards. This must have been a fairly popular work, for 
a second edition appeared in 1550. The scope and elaboration in this book suggests that there may some- 
where be earlier prototypes. Marcolini has 50 questions, 13 for men, 13 for women and 24 for men and 
women. To each question he has 90 answers written in rhymed triplets, 4500 answers in all. The sortilege 
consists of going from a question to one of the 90 corresponding answers by an elaborate piece of mysti- 
fication involving the drawing of cards at three different stages. The questions themselves are an inter- 
esting collection of the topics which exercised the minds of the gentlefolk of sixteenth-century Italy, 
and most of the politer ones would be found in any fortune-telling book of today. For example, question 3 
for men is: whether it is better to choose a beautiful or a plain wife (bella o brutta). One possible answer 
to this (to take an example offered in illustration by Marcolini himself) is 


‘Una brutta e una bella ne torrai 
Se vuol la legge; et se no vuol, sei matto 
Se una sol moglie e brutta pigliorai.’ 


This, I take it, means: ‘Have one of each if the law permits, but if not, you would be crazy to choose a 
plain one.’ 


5. For students of combinatorial analysis and experimental design there is some mild interest in the 
mechanism by which Marcolini gets from question to answer. He works with a pack from which four 
cards of each suit, 3, 4, 5, 6 have been stripped. In point of fact, he makes no distinction between suits, 
so he might just as well work with a pack of nine cards of one suit. Two are chosen (effectively with re- 
placement) so that there are 45 possible pairs (the order of a pair being immaterial). These 45 are dis- 
played in full on a single page, one page to each of the 50 questions. The 45 possibilities, however, are 
divided into five groups of nine so that effectively there are only five possibilities per question. Marcolini 
then adds two more stages each involving choice of one card. But he again reduces the possibilities so 
that effectively there are only 5 x 18 = 90 per question. How or why these numbers were chosen is a 
mystery. Personally I do not feel much confidence in Marcolini’s command of combinatorial analysis. 
There appear to me to be places where he failed to preserve the balance of his incomplete blocks and just 
faked them. But it is easy to be superior with the advantage of four centuries’ experience, and his 
drawings are fascinating. 
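The counts quoted in this section can be reproduced by simple enumeration. In the sketch below the labels given to the nine remaining ranks are an assumption (the text says only that the 3, 4, 5 and 6 of each suit are removed), and the code illustrates the arithmetic only, not Marcolini's actual tables.

```python
from itertools import combinations_with_replacement

# Nine ranks left in each suit once 3, 4, 5 and 6 are stripped (labels assumed)
ranks = ["A", "2", "7", "8", "9", "10", "J", "Q", "K"]

# Two cards chosen effectively with replacement, order immaterial
pairs = list(combinations_with_replacement(ranks, 2))
n_pairs = len(pairs)                    # 9 unlike-free pairs plus 36 unlike pairs

groups_per_question = n_pairs // 9      # five groups of nine pairs
answers_per_question = groups_per_question * 18
total_answers = 50 * answers_per_question

print(n_pairs, groups_per_question, answers_per_question, total_answers)
```

The 45 pairs are made up of 36 pairs of unlike cards and 9 pairs of like cards, so that the pairs are not equally probable, a point which the book does not appear to take into account.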


6. In Le Sorte we have the last stage of degeneration of the old sortilege, a deliberate obfuscation of 
the selective process. Henceforward the Book of Fate is of no scientific interest. It still exists. I recently 
bought a publication called Napoleon’s Book of Fate part of which is evidently a lineal descendant of 
Le Sorte. 


7. One curious development in the musical field may be worth recording. In 1757 a certain Johann 
Phillip Kirnberger published in Berlin a book which enabled Polonaises and Minuets to be composed by 
throwing dice. What is more, they were composed for two violins and pianoforte. This game seems to 
have caught on. Haydn is credited with a Guioco Filarmonico published in 1790 for the composition of 













minuets. To C.P.E. Bach has been attributed a publication of the same year for composing waltzes. In 
1793, after Mozart’s death, there was published (in four languages) a similar method, of composing with 
two dice. This work is attributed to Mozart in the Köchel-Einstein index (Anhang 294d). It reappeared 
in 1806 in England (C. Wheatstone, London) as ‘Mozart’s Musical Game, fitted in an elegant box, showing 
by an easy system to compose an unlimited number of Waltzes, Rondos, Hornpipes and Reels.’ I have 
not been able to trace the Wheatstone edition or to find out what was in the elegant box, but the German 
edition and Kirnberger’s work are both in the British Museum. 

These works are prefaced by fairly lengthy descriptions of how to apply the tables which are given. 
They give no account of the method of construction. I am certain that the general method in all these 
works must have been the same. We write down a simple harmonic sequence of, say, eight chords, 
finishing on the tonic. We can then compose a large number of bars from each of these chords by using 
notes of the triads or their inversions. We choose eleven of them and number them from 2 to 12 corre- 
sponding to the total points on two dice; and hence we can pick out one at random. 

Likewise we can do the same for each of the eight chords in the sequence. And if we put the resulting 
eight bars together in the right order we shall, with a little luck, have an acceptable air. The same pro- 
cedure can be followed with the left hand. 

There only remains to put in the mystique. This is done by arranging our 88 bars in random order and 
then providing a table to unscramble them. 
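
The construction just described can be sketched in a few lines of Python. This is only an illustration of the general method conjectured above, not a transcription of any of the published tables: the bars are represented by hypothetical labels `(chord, total)` rather than by music, and the function names `build_dice_table` and `compose` are invented for the example.

```python
import random


def build_dice_table(n_chords=8, seed=0):
    """Build the scrambled list of bars and the table that unscrambles it.

    Each bar is a hypothetical label (chord position, dice total); a real
    table would hold a bar of music written on that chord.
    """
    rng = random.Random(seed)
    bars = [(chord, total) for chord in range(n_chords) for total in range(2, 13)]
    rng.shuffle(bars)  # the "mystique": bars are printed in random order
    table = {chord: {} for chord in range(n_chords)}
    for printed_number, (chord, total) in enumerate(bars, start=1):
        table[chord][total] = printed_number
    return bars, table


def compose(table, rng):
    """Throw two dice for each chord position and look up the bar to play."""
    piece = []
    for chord in sorted(table):
        total = rng.randint(1, 6) + rng.randint(1, 6)
        piece.append(table[chord][total])
    return piece


bars, table = build_dice_table()
print(compose(table, random.Random(1)))  # eight printed bar numbers, in playing order
```

With eight chords and eleven possible totals the table holds 88 bars, and every throw sequence returns one bar for each chord position in the right order.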


Miscellanea


8. The student of probability and statistics will, I fear, derive little of relevance to his subject in 
these fortune-telling books and games. The chance mechanisms are of interest, perhaps, but there is no 
sign that, in any of the cases I have described, there was any appreciation of relative frequency. 


REFERENCES 

MARCOLINI, F. (1540). Le Sorte. Venetia.

KENDALL, M. G. (1956). Studies in the history of probability and statistics. II. The beginnings of a
probability calculus. Biometrika, 43, 1.

KIRNBERGER, J. P. (1757). Der allerzeit fertige Polonaisen und Menuettencomponist. Berlin: G. L.
Winter.

KÖCHEL, L. VON (1937). Chronologisch-theoretisches Verzeichnis sämtlicher Tonwerke W. A. Mozarts.
Dritte Auflage, bearbeitet von Alfred Einstein. Leipzig: Breitkopf und Härtel.


A derivation of the Borel distribution 


By J. C. TANNER 
Road Research Laboratory, Department of Scientific and Industrial Research 


In a queueing process with random arrivals at a rate $q$ per unit time and constant service time $\beta$, the
number of units served during a busy period follows a Borel distribution
\[ P_n = \frac{e^{-n\beta q}(n\beta q)^{n-1}}{n!} \qquad (n = 1, 2, 3, \ldots). \tag{1} \]
This was shown by Borel (1942), whose argument was extended by Tanner (1953) to show that the
distribution of the number of units served in a busy period starting with an accumulation of $r$ units is
\[ P_{n,r} = \frac{r\,e^{-n\beta q}\,n^{n-r-1}(\beta q)^{n-r}}{(n-r)!} \qquad (n = r, r+1, \ldots). \tag{2} \]
The object of this note is to give a simple derivation of the distribution (2).
The fact that the expression (2) may be put into the form
\[ P_{n,r} = \frac{e^{-n\beta q}(n\beta q)^{n-r}}{(n-r)!} \times \frac{r}{n} \qquad (n = r, r+1, \ldots) \tag{3} \]
suggests that there may be a simple derivation based on the Poisson distribution. The following are
necessary and sufficient conditions for a busy period to contain just $n$ units:
(i) precisely $n-r$ units shall arrive within a time $n\beta$;
(ii) the pattern of these arrivals shall be ‘admissible’. The term ‘admissible’ is used here to mean that
at least one unit must arrive during a time $r\beta$, at least two within $(r+1)\beta$, and so on, so that the server
will remain continuously occupied for the whole of the time $n\beta$.








The probability of (i) is clearly
\[ \frac{e^{-n\beta q}(n\beta q)^{n-r}}{(n-r)!}, \]
and it thus remains to show that the probability of the $n-r$ arrivals being admissible is $r/n$.

Consider any particular arrival pattern; suppose that there are $a_1$ arrivals in the first service period,
$a_2$ in the second and so on (the timing of the arrivals within each service time is irrelevant). Consider
now the set of $n$ arrival patterns obtained by permuting this pattern cyclically. It will be sufficient if
we can demonstrate that precisely $r$ of these $n$ patterns are admissible.

Suppose that instead of there being $r$ units waiting initially there are an arbitrary number $N$, greater
than $n+r$, and that the arrival pattern $a_1, a_2, \ldots, a_n$ is repeated in the $(n+1)$th, ..., $2n$th service times.
Thus the $n$ cyclical permutations of the arrival pattern are obtained by taking the $n$ service times
starting with the 1st, 2nd, ..., $n$th. Consider how the queue size will vary during these $2n$ service times.
At the beginning of the first service time, it will be $N$ (including the unit starting to be served), at the
beginning of the second, $N + a_1 - 1$ and so on. At the beginning of the $(n+1)$th it will be $N - r$, since there
will by that time have been $n-r$ arrivals and $n$ completed services; at the beginning of the $(2n+1)$th
it will be $N - 2r$. More generally, the queue sizes at the beginnings of the $(n+1)$th, ..., $2n$th service times
will be just $r$ less than they were $n$ service times earlier. This means that there must be precisely $r$
service times out of the $(n+1)$th, ..., $2n$th at the start of which the queue size is less than it has pre-
viously been. For if the smallest queue size at the beginning of any of the first $n$ service times was $N(k)$
for the $k$th service time, then the smallest for the next $n$ must be $N(k) - r$ for the $(k+n)$th, and since the
queue size can diminish by at most one per service time, there must be $r - 1$ out of the service times between
the $(n+1)$th and the $(k+n-1)$th inclusive at which the queue sizes decreased for the first time to
$N(k)-1, N(k)-2, \ldots, N(k)-r+1$. Also these $r$ service times will be the only ones at the start of which
the queue size is less than at the start of any of the preceding $n$ service times, since the queue size at
each service time before these $n$ is greater than at at least one of the $n$.

[Fig. 1. Queue sizes at the beginning of successive service times, for illustrative example with n = 10, r = 3.
Horizontal axis: units of service time, 1 to 20. Rows: number of arrivals during service time; admissible
arrival patterns (A, B, C); queue size at beginning of service time, from N down to N-6.]

We can now interpret this result in terms of our original problem. For an arrival pattern is admissible 
if and only if, starting with r in the queue, the queue size would first drop to zero at the start of the 
(n+ 1)th service time, in other words if it would be smaller at the start of the (n+ 1)th service time 
than at the start of any of the first n. Thus there are just r admissible patterns, viz. those which are 
immediately followed by the queue size in our sequence of 2n service times falling for the first time to 
$N(k)-1, N(k)-2, \ldots, N(k)-r$.
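
The counting argument can be checked by brute force. The sketch below is an illustration only: the function names are invented for the example, `rho` stands for the product of service time and arrival rate, and the patterns enumerated are arbitrary small cases rather than the one drawn in Fig. 1. It confirms that exactly r of the n cyclic permutations of any pattern with n-r arrivals are admissible, and that the resulting probabilities, r/n times the Poisson term, sum to one when `rho` is less than unity.

```python
from itertools import product
from math import exp, lgamma, log


def is_admissible(pattern, r):
    """True if, starting with r units, the queue first empties after len(pattern) services."""
    queue = r
    for arrivals in pattern[:-1]:
        queue += arrivals - 1  # one service completed, `arrivals` units joined
        if queue <= 0:
            return False       # server became idle too early
    return queue + pattern[-1] - 1 == 0


def count_admissible_rotations(pattern, r):
    """Number of cyclic permutations of the pattern that are admissible."""
    pattern = list(pattern)
    return sum(is_admissible(pattern[k:] + pattern[:k], r) for k in range(len(pattern)))


def check_all_patterns(n, r):
    """Check that every pattern with n - r arrivals in n service times has exactly r admissible rotations."""
    for pattern in product(range(n - r + 1), repeat=n):
        if sum(pattern) == n - r and count_admissible_rotations(pattern, r) != r:
            return False
    return True


def borel_tanner_pmf(n, r, rho):
    """Probability that a busy period starting with r units serves n units (rho = service time x arrival rate)."""
    log_poisson = -n * rho + (n - r) * log(n * rho) - lgamma(n - r + 1)
    return (r / n) * exp(log_poisson)


print(check_all_patterns(5, 2))
print(sum(borel_tanner_pmf(n, 2, 0.5) for n in range(2, 400)))
```

The Poisson term is evaluated on the log scale so that large n does not overflow.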








As an illustration, Fig. 1 shows an example with n = 10, r = 3 and a particular arrival pattern
$a_1, a_2, \ldots, a_{10}$. The admissible arrival patterns are marked A, B and C. In this example, k = 9, N(k) = N-3.


This article is published by permission of the Director of Road Research. 


REFERENCES 


BOREL, E. (1942). Sur l’emploi du théorème de Bernoulli pour faciliter le calcul d’une infinité de
coefficients. Application au problème de l’attente à un guichet. C.R. Acad. Sci., Paris, 214, 452-6.
TANNER, J. C. (1953). A problem of interference between two queues. Biometrika, 40, 58-69.


A theorem in trend analysis 


By M. G. KENDALL 
Research Techniques Division, London School of Economics 


1. The commonest methods of fitting a trend to a series defined at equidistant time-intervals involve 
the use of some form of moving average. In the standard method a polynomial of order p is fitted to 
sets of n (= 2m+1) points by least squares and the value of the polynomial at the middle point is taken 
as the trend value at that point. An account of the method is given in volume 2 of my Advanced Theory 
of Statistics. The same trend point is arrived at whether p is even or the next highest odd integer. 

In particular the fitting of a quadratic or cubic to sets of seven points gives a moving average with
weights
\[ \tfrac{1}{21}[-2, 3, 6, 7, 6, 3, -2]. \tag{1} \]
The fitting of a quartic or quintic to sets of 17 points ($p = 4, 5$, $m = 8$) has the weights
\[ \tfrac{1}{4199}[195, -195, -260, -117, 135, 415, 660, 825, 883, \ldots]. \tag{2} \]


2. It is more convenient for most purposes to express these formulae in terms of differences. From
(1), for example, we have, in an obvious notation,
\[ \tfrac{1}{21}[-2, 3, 6, 7, \ldots]\,u_0 = u_0 + \tfrac{1}{21}[-2, 3, 6, -14, 6, 3, -2]\,u_0 = u_0 - \tfrac{1}{21}\Delta^4[2, 5, 2]. \tag{3} \]
The fact that the moving average is exact for polynomials of cubic degree implies that the term on the
extreme right in (3) must be expressible in terms of fourth differences. We determine the weights by
factorizing the polynomial
\[ -2x^6 + 3x^5 + 6x^4 - 14x^3 + 6x^2 + 3x - 2 \]
into
\[ -(x-1)^4(2x^2 + 5x + 2). \]
Likewise (2) can be expressed as
\[ u_0 + \tfrac{1}{4199}\Delta^6[195, 975, 2665, 5148, 7623, 8778, \ldots]. \tag{4} \]
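
The weights and their difference forms can be reproduced numerically. The following is a minimal sketch under stated assumptions: the central weight vector is taken from the pseudo-inverse of the least-squares design matrix, the scaled weights are rounded because they are known to be integers, and the helper names `central_weights` and `difference_form` are invented for the example.

```python
import numpy as np


def central_weights(m, degree):
    """Moving-average weights giving the least-squares polynomial value at the middle of 2m+1 points."""
    x = np.arange(-m, m + 1)
    design = np.vander(x, degree + 1, increasing=True)
    # Row 0 of the pseudo-inverse yields the fitted constant term, i.e. the fitted value at x = 0.
    return np.linalg.pinv(design)[0]


def difference_form(m, degree, scale, order):
    """Return integer weights, and the quotient and remainder of (weights - identity) divided by (x - 1)^order."""
    weights = np.rint(central_weights(m, degree) * scale)
    residual = weights.copy()
    residual[m] -= scale                  # subtract the identity operator
    divisor = np.poly([1.0] * order)      # coefficients of (x - 1)^order
    quotient, remainder = np.polydiv(residual, divisor)
    return weights, quotient, remainder


w7, q7, r7 = difference_form(3, 2, 21, 4)
print(w7)   # seven-point weights times 21
print(q7)   # coefficients of -(2x^2 + 5x + 2)

w17, q17, r17 = difference_form(8, 4, 4199, 6)
print(w17[:9])
print(q17[:6])
```

A zero remainder confirms that the residual operator is an exact multiple of the fourth (or sixth) difference.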


3. There are several advantages in using these forms. As a rule the differenced series yield smaller
numbers than the original series and the arithmetic is simpler. Likewise, all the terms within the square
bracket in expressions like (3) and (4) have the same sign, a result which follows from the later part of
this paper as a limiting property, but appears to be generally true. Most important of all, the expressions
involving differences give us directly the residuals when trend is abstracted from the series.

The moving average [2, 5, 2] in (3) can be represented as [2, 1] [1,2], and generally any expression of 
this kind, from its symmetry, can be factorized in the same way. The coefficients in this factorization, 
however, are not always rational. 


4. It will be observed that the weights within square brackets in (3) and (4) are symmetrical unimodal
patterns and I proceed to prove a rather remarkable theorem concerning their limiting values as the
number of points n and the order p of the fitting increase: in fact, they tend to the ordinates of a normal
frequency function taken at equal intervals on either side of the mean.


5. The orthogonal polynomials for equidistant intervals customarily employed in statistics are
standardized so that the coefficient of the highest power in $x$ is unity. Thus, if we write
\[ P_1 = x - \tfrac12(n+1), \]
the next three polynomials are
\[ P_2 = P_1^2 - \tfrac{1}{12}(n^2-1), \]
\[ P_3 = P_1^3 - \tfrac{1}{20}(3n^2-7)P_1, \]
\[ P_4 = P_1^4 - \tfrac{1}{14}(3n^2-13)P_1^2 + \tfrac{3}{560}(n^2-1)(n^2-9). \]
If we let $n$ be large such that $P_r/n^r = Q_r$ remains finite we have the limiting values
\[ Q_2 = Q_1^2 - \tfrac{1}{12}, \]
\[ Q_3 = Q_1^3 - \tfrac{3}{20}Q_1, \]
\[ Q_4 = Q_1^4 - \tfrac{3}{14}Q_1^2 + \tfrac{3}{560}, \]
and so on. Now these are, in fact, Legendre polynomials defined on the interval $-\tfrac12$ to $\tfrac12$ instead of the
more usual $-1$ to $+1$. The polynomials and this limiting property are, I believe, due to Tchebycheff
(1864).
6. For my present purpose it will be more convenient to use Legendre polynomials themselves.
Consider a time series $u(t)$ defined (by change of scale if necessary) on the interval $-1$ to $+1$. We suppose
the points of observation so close together as to allow summation to be replaced by integrals in this
interval. We fit a polynomial
\[ u = a_0 + a_1P_1 + \ldots + a_rP_r, \tag{5} \]
where $P_j$ is the Legendre polynomial of order $j$. In view of the orthogonality property
\[ \int_{-1}^{1} P_j(x)P_k(x)\,dx = \begin{cases} 0 & (j \neq k), \\ \dfrac{2}{2j+1} & (j = k), \end{cases} \tag{6} \]
we have $a_j = \tfrac12(2j+1)\int_{-1}^{1} uP_j\,dt$. Moreover
\[ P_j(0) = \frac{(-1)^{j/2}}{2^j}\binom{j}{j/2} \quad (j \text{ even}), \qquad P_j(0) = 0 \quad (j \text{ odd}), \]
and hence
\[ u(0) = \sum_{j=0}^{r} \tfrac12(2j+1)\,P_j(0)\int_{-1}^{1} uP_j\,dt. \tag{7} \]
Only terms of even order survive in this summation. Taking $r = 2p$ we then have
\[ u(0) = \sum_{j=0}^{p} \binom{-\frac12}{j}\,\tfrac12(4j+1)\int_{-1}^{1} P_{2j}\,u\,dt. \tag{8} \]
Thus the weight function attached to the variable $u$ to give the trend value $u(0)$ is
\[ \sum_{j=0}^{p} \binom{-\frac12}{j}\,\tfrac12(4j+1)\,P_{2j}. \tag{9} \]
I proceed to show that this tends to the $(2p+2)$th derivative of a normal frequency function. Why this
should be so is still something of a mystery to me. The sum in (9) is, it should be noticed, an ordinary
arithmetic sum, not a convolution of random variables for which a central limit effect might be expected
to generate some kind of normality.
7. The moment of order $2k$ of (9) is given by
\[ \mu_{2k} = \int_{-1}^{1} \sum_{j=0}^{p} \binom{-\frac12}{j}\,\tfrac12(4j+1)\,t^{2k}P_{2j}(t)\,dt. \tag{10} \]
Odd-order moments, of course, are zero. Now it is known (cf. for example, Edwards, 1922, vol. 2,
p. 902) that
\[ \int_{-1}^{1} t^{2k}P_{2j}\,dt = \begin{cases} 0 & (j > k), \\[4pt] \dfrac{2^{2k+1}\{(2k)!\}^2}{(4k+1)!} & (j = k), \\[8pt] \dfrac{k(k-1)(k-2)\ldots(k-j+1)\ \ (j \text{ factors})}{(k+\frac12)(k+\frac32)\ldots(k+j+\frac12)\ \ (j+1 \text{ factors})} & (j < k). \end{cases} \tag{11} \]
We then have, for the $2k$th moment of the weighting function,
\[ \mu_{2k} = \sum_{j=0}^{p} \binom{-\frac12}{j}\,\tfrac12(4j+1)\,\frac{k(k-1)\ldots(k-j+1)}{(k+\frac12)(k+\frac32)\ldots(k+j+\frac12)}. \tag{12} \]
$\mu_0$ is unity. For $p = 0$ we find
\[ \mu_{2k} = \frac{1}{2k+1}. \]
For $p = 1$,
\[ \mu_{2k} = -\frac{3(k-1)}{(2k+1)(2k+3)}. \]
For $p = 2$,
\[ \mu_{2k} = \frac{3\cdot5\,(k-1)(k-2)}{2(2k+1)(2k+3)(2k+5)}. \]
It is then easy to show by induction that
\[ \mu_{2k} = (-1)^p\,\frac{3\cdot5\ldots(2p+1)\,(k-1)(k-2)\ldots(k-p)}{p!\,(2k+1)(2k+3)\ldots(2k+2p+1)} \qquad (p > 0). \tag{13} \]
Thus the moment generating function of the weight function (9) is
\[ 1 + \sum_{k=1}^{\infty} (-1)^p\,\frac{3\cdot5\ldots(2p+1)\,(k-1)\ldots(k-p)}{p!\,(2k+1)\ldots(2k+2p+1)}\,\frac{\theta^{2k}}{(2k)!}. \]
Apart from the initial value of unity the first terms vanish for $k = 1, \ldots, p$.


8. Consider the successive terms in (13). If we subtract the first term, unity, corresponding to $u_0$
itself, we obtain the moment generating function of the residual as
\[ (-1)^p\,\frac{3\cdot5\ldots(2p+1)}{(2p+3)(2p+5)\ldots(4p+3)}\,\frac{\theta^{2p+2}}{(2p+2)!}\left[1 + \frac{(p+1)(2p+3)}{4p+5}\,\frac{\theta^2}{(2p+3)(2p+4)} + \frac{(p+1)(p+2)(2p+3)(2p+5)}{2!\,(4p+5)(4p+7)}\,\frac{\theta^4}{(2p+3)(2p+4)(2p+5)(2p+6)} + \text{etc.}\right]. \]
The term in square brackets is, for large $p$, approximated by
\[ 1 + \frac{\theta^2}{8p} + \frac{1}{2!}\left(\frac{\theta^2}{8p}\right)^2 + \text{etc.} = e^{\theta^2/8p}. \]
Thus, for large $p$ the moment generator of the weight function is approximately
\[ \text{constant} \times \theta^{2p+2}\,e^{\theta^2/8p}. \tag{14} \]
Now this, with a suitable scale, is the m.g.f. of $(d/dx)^{2p+2}e^{-\frac12x^2}$, the normal function multiplied by the
Tchebycheff-Hermite polynomial of order $2p+2$. And this (continuous) quantity is approximately
represented by the discontinuous quantity $\Delta^{2p+2}$ acting on an approximately normal function. Hence,
if we fit a polynomial of order $2p$ the weighting function of the residual is approximated by the $(2p+2)$th
difference of a normal function. This is the theorem.


9. As a numerical illustration I took the formula for fitting a quartic or quintic to 21 points, which
has the residual
\[ \tfrac{1}{260{,}015}\Delta^6[11{,}628;\ 63{,}308;\ 192{,}423;\ 426{,}258;\ 759{,}003;\ 1{,}135{,}134;\ 1{,}460{,}449;\ 1{,}591{,}294;\ \ldots]. \tag{15} \]
Taking the figures in square brackets (which total 9,687,700) as the ordinates of a grouped distribution
I find for the second moment about the mean, with Sheppard’s correction,
\[ \mu_2 = 5.638402 - 0.083333 = 5.555069. \]
The proportional ordinates of (15) are compared with the ordinates of a normal curve with this variance
as follows:

Deviation
from mean      Observed      Normal
    0          0.387         0.399
    1          0.355         0.365
    2          0.276         0.278
    3          0.185         0.177
    4          0.1037        0.0945
    5          0.0468        0.0420
    6          0.0154        0.0156
    7          0.00283       0.00485

For p as low as 2 this seems a very fair fit.
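
The comparison can be recomputed from the figures printed in (15). The sketch below is an illustration only: it takes the eight printed values as given, completes the bracket by symmetry, and expresses both columns as ordinates of a curve with unit standard deviation (that is, multiplied by the standard deviation), which is the scaling that reproduces the tabulated values. The variable names are invented for the example.

```python
from math import exp, pi, sqrt

# Figures as printed in (15); the remaining seven follow by symmetry.
half = [11628, 63308, 192423, 426258, 759003, 1135134, 1460449, 1591294]
ordinates = half + half[-2::-1]
deviations = range(-7, 8)

total = sum(ordinates)
raw_second_moment = sum(d * d * w for d, w in zip(deviations, ordinates)) / total
variance = raw_second_moment - 1 / 12      # Sheppard's correction for unit grouping
sigma = sqrt(variance)

# Standardized ordinates at deviations 0, 1, ..., 7 from the mean.
observed = [sigma * half[7 - d] / total for d in range(8)]
normal = [exp(-d * d / (2 * variance)) / sqrt(2 * pi) for d in range(8)]

print(total, round(raw_second_moment, 6), round(variance, 6))
for d in range(8):
    print(d, round(observed[d], 5), round(normal[d], 5))
```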


10. The approximation (14) to the m.g.f. of the weighting function of the residual and the corre-
sponding expression for the function itself may also be used to provide information about the auto-
correlations of residual terms. Suppose that the original series $u$ consisted of a polynomial plus a
random term. On the scale for which our weighting function is proportional to $D^{2p+2}e^{-\frac12x^2}$ the auto-
covariance of the residuals with lag $c$ is then proportional to
\[ \int_{-\infty}^{\infty} \left(D^{2p+2}e^{-\frac12x^2}\right)\left(D^{2p+2}e^{-\frac12(x-c)^2}\right)dx, \]
which, by partial integration, reduces to
\[ \int_{-\infty}^{\infty} \left(D^{4p+4}e^{-\frac12x^2}\right)e^{-\frac12(x-c)^2}\,dx = \int_{-\infty}^{\infty} H_{4p+4}(x)\,\exp\{-(x-\tfrac12c)^2 - \tfrac14c^2\}\,dx = e^{-\frac14c^2}\int_{-\infty}^{\infty} H_{4p+4}(x)\,e^{-(x-\frac12c)^2}\,dx. \tag{16} \]
The generating function of $H_j(x)$ is given by
\[ \sum_{j=0}^{\infty} \frac{t^jH_j(x)}{j!} = \exp(tx - \tfrac12t^2). \]
Multiplying by $\exp\{-(x-\tfrac12c)^2\}$ and integrating over $x$ we find, on some re-arrangement, for the expres-
sion (16) a constant multiplied by
\[ \exp(-\tfrac14c^2)\,H_{4p+4}\!\left(\frac{c}{\sqrt2}\right). \]
For $c = 0$ this is equal to
\[ \frac{(4p+4)!}{(2p+2)!\,2^{2p+2}} \]
and hence the autocorrelation function is
\[ \rho(c) = \frac{(2p+2)!\,2^{2p+2}}{(4p+4)!}\,\exp(-\tfrac14c^2)\,H_{4p+4}\!\left(\frac{c}{\sqrt2}\right). \tag{17} \]
The correlogram is accordingly the function (17) graphed against $c$, and has the appearance of a damped
oscillator. The power spectrum, being the Fourier inversion, will be of type $c^{4p+4}e^{-c^2}$, a bell-shaped
curve falling rapidly towards zero.
Similar methods can clearly be used when the original series consisted of a polynomial plus an element
which was itself autocorrelated.
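
The correlogram (17) is easily tabulated. The sketch below is an illustration only: it assumes that H denotes the Tchebycheff-Hermite polynomials with generating function exp(tx - t²/2), which NumPy evaluates through `numpy.polynomial.hermite_e.hermeval`, and the function name `residual_autocorrelation` is invented for the example.

```python
import math

import numpy as np
from numpy.polynomial.hermite_e import hermeval


def residual_autocorrelation(c, p):
    """Approximate autocorrelation at lag c of residuals from a fitted polynomial of order 2p."""
    order = 4 * p + 4
    coefficients = np.zeros(order + 1)
    coefficients[order] = 1.0          # selects the single polynomial He_order
    scale = math.factorial(2 * p + 2) * 2 ** (2 * p + 2) / math.factorial(order)
    c = np.asarray(c, dtype=float)
    return scale * np.exp(-c ** 2 / 4) * hermeval(c / np.sqrt(2), coefficients)


lags = np.linspace(0.0, 6.0, 13)
for lag, rho in zip(lags, residual_autocorrelation(lags, 1)):
    print(f"{lag:4.1f}  {rho: .4f}")
```

The values start at unity for zero lag and alternate in sign with rapidly shrinking amplitude, which is the damped oscillation described above.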
REFERENCES 


TCHEBYCHEFF, P. L. (1864). Sur l’interpolation. Zapiski Akademii Nauk, 4, Supplement number 5.
Collected Works, 1, 539.
EDWARDS, J. (1922). The Integral Calculus. London: Macmillan and Co.


Unbiased estimation of a set of probabilities 


By D. E. BARTON 
University College London 


1. The recent paper by Blythe & Curme (1960) has drawn attention to the possibility of unbiased 
estimation of polynomial functions of a parameter of a distribution with particular reference to the 
notion of completeness introduced by Lehmann & Scheffé (1950). (And it is worthy of notice, in view of 
what follows, that Stuart’s (1955) discussion of his anomaly involves the unbiased estimation of a 
quadratic in the Binomial p parameter.) This has suggested the problem of obtaining unbiased estimators 
(u.b.e.’s) of population probabilities which are functions of the maximum-likelihood estimators of the 
parameters concerned. It is envisaged that these ‘population values’ will be used not only for the 
graduation of data but also to provide ‘expected group frequencies’ from which the $\chi^2$ goodness-of-fit
test statistic is to be computed. In § 2 below we treat the particular case of the binomial distribution by
direct construction. In § 3 we observe that what we have done may be looked at as an application of
Rao’s (1945) and Blackwell’s (1947) device† and the method is then applied to obtain corresponding
results for the Poisson and normal distributions. 


† This device was first used by Rao (1945); it is implicit in equation (3·7) and elsewhere in that paper.
Blackwell emphasizes that it provided a method of construction of an estimator function.

2. If we have a set of $n$ independent samples $r_1, \ldots, r_n$ from the binomial distribution
\[ P_r = p(r) = {}^{N}C_r\,p^rq^{N-r}, \qquad r = 0, \ldots, N \quad (q = 1-p), \]
then
\[ R = \sum_{i=1}^{n} r_i \]
has distribution
\[ p(R) = {}^{Nn}C_R\,p^Rq^{Nn-R}. \]
It follows that, for integral values of $r$,
\[ \mathcal{E}\{R^{(r)}(Nn-R)^{(N-r)}\} = (Nn)^{(N)}p^rq^{N-r}, \]
where $R^{(r)} = R(R-1)\ldots(R-r+1)$, etc. Thus if
\[ \hat{P}_r = {}^{R}C_r\;{}^{Nn-R}C_{N-r}\,/\,{}^{Nn}C_N, \]
then for all $r$
\[ \mathcal{E}(\hat{P}_r) = P_r. \]
It is worth noting that one proof of the result that $\hat{P}_r$ is the unique function of $R$ which is an u.b.e. of
$P_r$ is by appeal to Lehmann & Scheffé’s result that the distribution of $R$ is complete.
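
The unbiasedness of the binomial estimator can be verified exactly for small cases. The sketch below is an illustration only: it uses rational arithmetic to sum the estimator over the distribution of R, and the function names are invented for the example.

```python
from fractions import Fraction
from math import comb


def estimator(r, R, N, n):
    """Hypergeometric form of the estimator of P_r given the pooled total R of n samples of index N."""
    return Fraction(comb(R, r) * comb(N * n - R, N - r), comb(N * n, N))


def expected_estimator(r, N, n, p):
    """Exact expectation of the estimator when R is binomial with index N*n and parameter p."""
    M = N * n
    q = 1 - p
    return sum(comb(M, R) * p ** R * q ** (M - R) * estimator(r, R, N, n) for R in range(M + 1))


N, n, p = 3, 4, Fraction(1, 3)
for r in range(N + 1):
    target = comb(N, r) * p ** r * (1 - p) ** (N - r)
    print(r, expected_estimator(r, N, n, p), target)
```

Each line prints the same fraction twice, the expectation of the estimator and the binomial probability it estimates.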


3. More generally let us consider a random variable $x$ whose distribution function depends on a set
of parameters $\theta = (\theta_1, \ldots, \theta_s)$. If a set of $n$ values $x_1, \ldots, x_n$ are randomly and independently drawn from
this and their likelihood admits the set of statistics $t = (t_1, \ldots, t_{s'})$, with $s \le s' < n$, which are col-
lectively sufficient for $\theta$, then let us take an interval $[X_1, X_2]$ and search for an u.b.e., purely a function
of $t$, of the function
\[ P = P\{X_1 \le x \le X_2\}, \]
which is a function purely of the unknown $\theta$ and the two specified numbers $X_1$, $X_2$. The application of
the Blackwell-Rao device is to observe that if $m$ is the number of $x_1, \ldots, x_n$ which fall in $[X_1, X_2]$ then
\[ \hat{P} = \frac{1}{n}\,\mathcal{E}(m \mid t) \]
is an u.b.e. of $P$ (since $m/n$ is so, overall). Further, since
\[ \mathcal{E}(m \mid t) = nP\{X_1 \le x \le X_2 \mid t\}, \]
where $x$ is any of $x_1, \ldots, x_n$, then
\[ \hat{P} = P\{X_1 \le x \le X_2 \mid t\}. \]
Some examples will illustrate the method.
(a) If $x$ is a binomial variable of unknown parameter $p$ and the interval contains just one integer
value, $r$ say, then we obtain the expression for $\hat{P}_r$ already given in § 2.
(b) If $x$ is a Poisson variable of unknown mean $\lambda$, the sample mean, $\bar{x}$ say, is sufficient for $\lambda$ and if the
interval contains just one integer value $r$, then
\[ \hat{P}_r = \binom{n\bar{x}}{r}\left(\frac{1}{n}\right)^{r}\left(1 - \frac{1}{n}\right)^{n\bar{x}-r} \]
(which is easily seen a priori to have the expectation $e^{-\lambda}\lambda^r/r!$ as it should). A similar result follows in
the same way for a truncated Poisson distribution.
(c) If $x$ is normal with unknown mean $\mu$ and known standard deviation (taken for convenience to be
unity), then $\bar{x}$ is sufficient for $\mu$ and, since the joint distribution of $(x, \bar{x})$ has the bivariate normal form
$N(\mu, \mu, 1, 1/\sqrt{n}, 1/\sqrt{n})$, then conditionally on $\bar{x}$, $x$ has the normal distribution $N\!\left(\bar{x}, \sqrt{(n-1)/n}\right)$. Therefore
\[ \hat{P} = \int_{X_1}^{X_2} \frac{\exp\{-n(x-\bar{x})^2/2(n-1)\}}{\sqrt{2\pi(n-1)/n}}\,dx \]
is an u.b.e. of
\[ P = \int_{X_1}^{X_2} \frac{1}{\sqrt{2\pi}}\,e^{-\frac12(x-\mu)^2}\,dx. \]
(d) In the same way, if $x$ has the normal distribution $N(\mu, \sigma)$ and we use the sufficient estimators $\bar{x}$ and
$s = \{\sum(x_i-\bar{x})^2/(n-1)\}^{\frac12}$ of $\mu$ and $\sigma$, then the integral over $[X_1, X_2]$ of the distribution of $x$ conditional on $\bar{x}$ and $s$, viz.
\[ \hat{P} = \int_{X_1}^{X_2} \frac{\Gamma\{\tfrac12(n-1)\}}{\Gamma\{\tfrac12(n-2)\}\sqrt{\pi}}\,\frac{\sqrt{n}}{(n-1)s}\left\{1 - \frac{n}{(n-1)^2}\left(\frac{x-\bar{x}}{s}\right)^2\right\}^{\frac12n-2}dx, \]
is an u.b.e. of
\[ P = \int_{X_1}^{X_2} \frac{1}{\sigma\sqrt{2\pi}}\exp\left\{-\frac12\left(\frac{x-\mu}{\sigma}\right)^2\right\}dx. \]
[The integrand in $\hat{P}$ is to be interpreted as zero for $|x-\bar{x}| > s(n-1)/\sqrt{n}$.]
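
As a check on example (b), the expectation of the Poisson estimator can be summed numerically over the distribution of the sample total. The sketch below is an illustration only: it assumes a sample of size n ≥ 2, truncates the Poisson sum at a hypothetical upper limit `max_total`, and uses function names invented for the example.

```python
from math import comb, exp, factorial


def poisson_estimator(r, total, n):
    """Estimator of P{x = r} given the total of a Poisson sample of size n (n >= 2)."""
    return comb(total, r) * (1 / n) ** r * (1 - 1 / n) ** (total - r)


def expected_poisson_estimator(r, n, lam, max_total=200):
    """Expectation of the estimator when the sample total is Poisson with mean n * lam."""
    mean = n * lam
    pmf = exp(-mean)                       # probability that the total is zero
    expectation = 0.0
    for total in range(max_total + 1):
        expectation += pmf * poisson_estimator(r, total, n)
        pmf *= mean / (total + 1)          # step to the next Poisson probability
    return expectation


n, lam = 5, 2.0
for r in range(5):
    print(r, expected_poisson_estimator(r, n, lam), exp(-lam) * lam ** r / factorial(r))
```

The two printed columns agree to rounding error, as the a priori argument requires.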

4. When the method of the present note is used to estimate the expected group frequencies for use
with the $\chi^2$ test it will be seen that it is a relatively easy matter to get the conditional moments of $\chi^2$
explicitly, and so to investigate the small sample application of the loss of degrees of freedom rule, but
this we will leave for a future note. When the data are ‘grouped up’ the estimators remain u.b.e.’s but
it will be noted that $\bar{x}$ and $s$ are not the maximum likelihood estimators in the case of grouped normal
frequencies.

REFERENCES 

BLACKWELL, D. (1947). Conditional expectation and unbiased sequential estimation. Ann. Math.
Statist. 18, 105-10.

BLYTHE, C. R. & CURME, G. L. (1960). Estimation of a parameter in the classical occupancy problem.
Biometrika, 47, 180-5.

LEHMANN, E. L. & SCHEFFÉ, H. (1950). Completeness, similar regions and unbiased estimation.
Sankhyā, 10, 305-40.

RAO, C. R. (1945). Information and accuracy in the estimation of statistical parameters. Bull. Calcutta
Math. Soc. 37, 81-91.

STUART, A. (1955). A paradox in statistical estimation. Biometrika, 42, 527-9.











Corrigenda 
Biometrika, (1960), 47, pp. 345-53. 


‘Small sample behaviour of certain tests of the hypothesis 
of equal means under variance heterogeneity.’ 


By R. S. McCullough, J. Gurland and L. Rosenberg


p. 352, Table 3, last line in third column 
read -27 for -74 


Biometrika, (1960), 47, pp. 469-73. 
‘Some distributions arising in the study of generalized mean differences.’ 


By T. A. Ramasubban


r+1 r+1 
p. 470, equation (1-4) read > i 63; 
s=1 s+1 
n r 
equation (3-3) read > for >; 
d=0 d=0 
: ($(r—2)] eh. 
equation (3-5) read bs for x (twice). 
j=0 j=0 


p. 471, equation (3-7), Ist line, left-hand side read ,A,, for Ma. 
2nd line, right-hand side read «yA, for (n—1) Ao. 


Biometrika, (1960), 47, pp. 433-37. 
‘Tables for making inferences about the variance of a normal distribution.’ 
By D. V. Lindley, D. A. East and P. A. Hamilton
p. 434, five lines from the end of §1 


for o® read o~. 










Biometrika (1961), 48, 1 and 2, p. 231 
Printed in Great Britain 


Reviews 


The Theory of Storage. By P. A. P. Moran. London: Methuen and Co. Ltd.; New York: 
John Wiley and Sons, Inc. 1959. Pp. 111. 13s. 6d. 


This is the first of the series of Methuen Monographs on Applied Probability and Statistics, edited 
by Prof. Bartlett; it has the same attractive format (and reasonable price) as, for example, the bio- 
logical series, and the mathematical printing is of good standard. 

After introducing some necessary (mainly mathematical) definitions, notations, ideas and techniques 
the book consists largely of a development of the distribution theory arising from the various models 
(of stochastic process type) which have been proposed for dealing with inventory and dam theory. 
It closes with two short chapters: one on Monte Carlo methods (of the direct simulation type) and the 
other on the choice of best or good Rules of Release (of stock or water, as the case may be) by means of 
such examples as that where several dams in series, with electric generators of different efficiencies, 
have to supply a given amount of power with minimal expected loss due to overflow. The Rule of 
Release problem is shown in the examples to reduce to one in Linear Programming or Calculus of 
Variations, and methods of solution of these are discussed or referred to. 

No problems of statistical inference, either of estimation or testing, in relation to actual realizations 
of the stochastic processes, are considered and, indeed, no actual data is given although the models 
(and their shortcomings) are discussed qualitatively in relation to real situations and, also, questions 
of testing the assumptions of the models from ancillary data (e.g. the independence of successive 
annual river-flows) and estimation of their parameters are treated (by the usual methods of elementary 
statistics). This is not a criticism of the author who has served the reader well in digesting some two 
score papers on storage problems but merely reflects the fact that the theory is still in the primitive 
‘model-building’ stage. He points out that, in any event, such necessary data as the annual flow of 
Australian rivers is apt to be too sparse for any exacting conclusions to be drawn and he is careful to 
warn against stretching a model too far. Further, the models have been commonly set up, not to 
describe any existing situation, but to answer the question of what would happen if various rules of 
restocking and release were adopted. 

As is the rule in such theory (which is closely similar to that of queuing) it is relatively easy to set 
up the Kolmogorov or equivalent equations and only slightly more difficult to pick out any embedded 
Markov processes. The main task is the exact or approximate solution of these equations. Again, as
usual, the solutions are, for the most part, for the limiting ‘equilibrium’ probabilities, though
the author gives reasons, not always entirely convincing, to suggest that these will be adequate for the
real situations envisaged.

In short, this book is an admirably clear introduction to the probability theory of inventory and 
dam processes, assuming only a knowledge of elementary statistical and stochastic process theory and 
a corresponding level of mathematics. Application is largely to the comparison of inventory or dam
schemes (by deduction of aspects of their behaviour) rather than to calibration of existing processes. 


D. E. BARTON 


Mathematical Methods in the Theory of Queueing. By A. Y. Khintchine. London:
Charles Griffin and Co. Ltd. 1960. Pp. 120. 32s. 


Khintchine’s book has been translated from the Russian by Messrs D. M. Andrews and M. H. 
Quenouille and is one of Griffin’s series of statistical monographs. In general terms the book is an 
elegant, original and rigorous development dealing largely with certain results of Erlang and Palm. 
Only a few rather special problems are treated; the recent flood of mathematical work on queueing 
is not dealt with. In fact, the book is a specialist work for mathematicians who are not experts in 
stochastic processes and is not a text-book for operational research workers; the latter fact is of course 
precisely conveyed in the title. 

The book is in three parts. Part I, which occupies about half the book, deals solely with the properties
of ‘streams’, i.e. of the series of arrivals which form the input into the queue. Chapter 1 derives
some simple properties of the Poisson process or, as it is called here, the simple stream. The cases of
constant and variable intensity are dealt with. Chapter 2 is devoted mainly to proving that the only
stream that satisfies certain conditions including the ‘absence of after-effects’ is what may be called 
an aggregated Poisson stream, i.e. consists of a Poisson process of ‘points’ together with a distribution 
determining the number of arrivals per point. Chapter 3 continues these ideas. Chapter 4 deals with 
streams with limited after effects. This is what in the British and American literature is called the 
‘general independent’ arrival pattern, i.e. the intervals between successive arrivals are independent 
identically distributed random variables. Chapter 5 is devoted to a careful proof of the result that the 
superposition of a large number of independent processes leads, under very general conditions on the 
individual processes, to a Poisson process. Part I is thus an essay on ‘point processes’, and not con- 
cerned specifically with queueing. 

The second half of the book deals with queueing processes as such, part II with systems with losses 
and part III with systems allowing delay. In part II the first chapter, Chapter 6, deals with a random 
stream of calls entering a system with n lines, the distribution of service-time being exponential. 
A call arriving when all servers are busy is lost. The formulation of the equilibrium equations is very 
carefully discussed and the ergodic interpretation of the equilibrium probabilities proved. Incidentally 
the open problem mentioned at the top of p. 69 has been solved by Karlin (in Studies in the Mathe- 
matical Theory of Inventory and Production, editors J. Arrow et al., Stanford University Press). 
Chapter 7 is concerned with the system with infinitely many lines and Chapter 8 with some interesting 
results concerning the series of events entering the rth line, when calls are assumed to occupy the line 
with lowest serial number amongst those free at the moment of entry. In the last three chapters, 
part III, systems with waiting are discussed, the three chapters dealing respectively with exponential 
service-time, and with the constant service-time single-server queue, and with the general service-time, 
single-server queue. 

The mathematical style of the book is of the highest quality. Results are precisely stated, but the 
mathematics is kept to an elementary level and irrelevant abstractness avoided. The author was, of 
course, until his recent death, one of the most eminent of the Soviet school of probabilists and his 
interest in queueing problems dates from the early 1930's. 

The translation is on the whole very clear indeed. A few minor points: the T. Fray mentioned in the 
bibliography and once or twice in the text may well be T. Fry, the American author of the well-known 
book on probability ; at the top of p. 19 there seems to be an unnecessary ‘not’, and there is an obvious 
slip in the final paragraph of p. 60. 




D. R. COX 


Regression Analysis. By E. J. Williams. London: Chapman and Hall; New York:
John Wiley and Sons, Inc. 1959. Pp. 214. 60s.


‘This book is addressed primarily to research workers in the experimental sciences. The problems 
with which it deals have arisen from consultation with such workers,...’—so states the author 
in his Preface (dated September 1959). In at least one quarter this book has been dismissed rather 
summarily in terms of ‘...the point of view is that which existed before the advent of statistical 
decision theory’. It is true that the comments given by the author (§ 2.9) on the problem of allocating 
the values of the independent variable are somewhat brief and one might have expected some reference 
to the few papers which did exist on this topic—by Elfving, Chernoff, Hoel and Guest. But it would 
have been difficult for the author to have included the recent and substantial paper by Kiefer & 
Wolfowitz [Ann. Math Statist. (1959), 30, 271-94]. Hence this is really not a satisfactory ground on 
which to pass judgement. The point that gives this reviewer cause for some concern is the rather 
sweeping title in relation to the material actually found in the book; although this material does 
apparently fulfil the standpoint of the author as given in the Preface. 

The eleven chapters of this book are intended to pivot round the seventh on Analysis of Covariance. 
The Introduction, dealing with some general considerations relating to tests of significance, is followed 
by three chapters on Linear Regression, Multiple and Polynomial Regression and Regression Equations 
requiring Iterative Calculation. The fifth and sixth chapters discuss Choice among Regression Formulas 
and Estimation from the Regression Equation. The four chapters following the one on covariance,
which itself is introduced as a variant of multiple regression, are intended to help with the treatment
of important multivariate problems. In the eighth chapter is a Treatment of Heterogeneous Data and 
this is followed by three chapters on Simultaneous Regression Equations, Discriminant Functions and 
Functional Relations. A list of seventy-five references covering a period from the early 1930’s to 1958 
together with an Index in the standard form complete the book. 







There are a number of interesting features about Dr Williams’s book. He uses a fiducial argument 
throughout which induces a rather ‘cathedral-like’ atmosphere into the presentation of some sections. 
Moreover, because the reader is assumed (vide Preface) to be familiar with—or have ready access to— 
general statistical methods, some of the drafting is ex cathedra to the point of being brutal (to the 
experimental scientist in mind). Topics covered in Chapter 4 include exponential and hyperbolic 
regression and the author may regret that a group of important papers on the former topic should have 
appeared just as his book was in Press. Chapter 5 (‘Choice among Regression Formulas’) is not quite 
that which it may seem. In fact it deals with how different independent variables and forms of 
regression functions may be compared and not the choice among a set of alternatives of a regression 
formula appropriate for the description of a collection of data. In the latter part of the book the reader 
will find discussions on tests of concurrence and of proportionality of regression lines; comparison of 
regressions derived from series which are cross-correlated; simultaneous regression equations in the 
field of chemistry and the method of instrumental variables applied to a problem of metal surface 
finishing. 

All this is most valuable and within the declared aims of the author. But does it justify the broad 
title under which the book is being sold? There are many well-known regression problems of importance 
to research workers which are not included in this book but which would reasonably be expected to be 
found in a monograph taking the title in question. As the book literature of statistical theory and method 
changes its emphasis from the general treatise to the monograph, an author will need to describe his 
work more closely and the publisher accept the (probably extended) title although it may appear 
superficially less attractive on the advertising material. Nevertheless, Dr Williams is to be thanked 
for making available his experiences in this field of statistics and congratulated on producing a co- 
ordinated review of regression methods which can be recommended to those for whom he writes and 
others with similar problems. 


Reviews 


WM. R. BUCKLAND 


Elementary Decision Theory. By Herman Chernoff and Lincoln E. Moses. New
York: John Wiley and Sons, Inc.; London: Chapman and Hall, Ltd. 1959. Pp. xv+364.
60s.


The authors suggest as a first course in statistics from the decision making point of view an introductory
chapter, and then chapters on data processing, probability and random variables, utility and
descriptive statistics, uncertainty due to ignorance of the state of nature, the computation of Bayes 
strategies, and introduction to classical statistics. The book ends with a valuable chapter on models, 
and two rather condensed but comparatively standard chapters on hypothesis testing and estimation 
and confidence intervals. There are eight tables, eighteen appendices, a partial list of answers to 
exercises, and an index. 

It is easy to criticize the book. The reader is expected to need a description of graph paper, and yet 
in the introductory chapter, seven pages suffice for the introduction (in the discussion of an example) 
of the concepts of available actions, states of nature, losses, an experiment, observations, frequency 
of responses, strategies, average loss, action probabilities, domination of strategies, admissibility, and 
the minimax rule. All this, before any mention of probability or expectation. Throughout the book, 
concepts are introduced and discussed in terms of the curious family and communal lives of the in- 
habitants of the Phiggins towns, the mythical nature of which becomes the more evident as the book 
progresses. 

There seems to have been a difference of opinion between the authors, who claim to be writing a book 
about statistics, and the publishers, who call it a book of related interest to statisticians. Certainly it 
would be impossible to obtain an adequate grasp of what is usually called elementary statistics from 
this book alone. The normal mean test and the t test are both dismissed in a single exercise; the
χ² distribution occurs only in the solution of an exercise; the existence of non-parametric methods is
barely mentioned. Nevertheless, for students with some more orthodox grounding, this book, by 
obvious enthusiasts, may prove stimulating. C. L. MALLOWS 













Individual Choice Behaviour. By R. Duncan Luce. New York: John Wiley and Sons, 
Inc.; London: Chapman and Hall, Ltd. 1959. Pp. xii+153. 48s.


This monograph contains an axiomatic approach to the probability models used in modern experi- 
mental psychology. It is thus concerned with how an individual actually does behave in uncertain 
situations, rather than with how an idealized ‘rational’ individual would behave. The basic axiom of 
choice behaviour is simply that in cases of imperfect discrimination, conditional choice probabilities 
can be defined in the usual way; in this context, this is not a tautology, since two distinct experiments 
would be required to verify the axiom. The consequences of this axiom are worked out with complete 
rigour and in great detail; it is shown to imply the existence of a ratio scale over all the possible choices. 
Further chapters describe applications of the basic theory to psychophysics, to utility theory (where 
a theory of decomposable preference structures is developed which leads to some remarkable predic- 
tions), and to learning theory (where three distinct learning models are constructed). The book ends 
with an excellent short chapter of summary and conclusions, three mathematical appendices, a list 
of open problems, bibliography and index. 

As an exercise in axiomatics, the book is a model. With one exception, every assumption made is 
stated with full precision and is carefully discussed; experiments for testing the theory are described 
and proposed. The exception occurs in the section on response bias, where an assumption similar to 
that of ‘no interaction’ in a factorial experiment is made, and is subsequently used without discussion 
in the construction of models for signal detectability experiments. 

The author has ‘strong reservations’ about the detailed accuracy of the expected utility hypothesis;
for statisticians who are not psychologists, it is in his remarks on this subject, and in the appeal of
a careful discussion of a basic probability theory in an unfamiliar context, that the value of the book
will mainly lie. C. L. MALLOWS


Principles and Procedures of Statistics (with special reference to the biological 
sciences). By R. G. D. Steel and J. H. Torrie. London, New York and Toronto:
McGraw-Hill Publishing Co. 1960. Pp. xvi+481. 81s. 6d. 


This is a ‘cook-book’ and a good one for the use of biologists who wish to acquire the elements of 
biometry. The statistical techniques are presented with the minimum of algebraic manipulation, the 
procedures to be adopted are clear and the whole is illustrated with many actual illustrations, most of 
which have the merit, to the reviewer at least, of being fresh material. 

Biologists, no matter from which side of the Atlantic they come, are timorous about algebra and 
even about algebraic notation, a trepidation which they often cover up with an aggressive disdain for 
abstract thought. It is difficult to see how Messrs Steel and Torrie could have used less symbolism 
than they have, but the average biologist will not find the book easy to read. For those who are forced 
to apply the techniques set out, understanding will come through hard work. For those who merely 
want to learn a smattering of a new subject something easier will need to be sought after. 

The scope of the book is wide. There are the usual descriptive statistics, probability, distributions of 
sample criteria, experimental design, analysis of variance (one-way and multi-way classifications), 
linear regression and correlation, more analysis of variance including factorial experiments, split plots
and unequal subclass numbers, multiple and partial regression and correlation, analysis of covariance,
nonlinear regression, χ² criterion and its various applications, the hypergeometric, binomial and
Poisson distributions, non-parametric statistics and stratified sampling. The first chapter on the definition
of statistics and its history reads a little oddly and could with advantage have been omitted. The
readers of this book are not going to ask themselves what statistics is (or are) but merely how the 
methods are used. 

The book can be recommended. F. N. DAVID 


The Statistical Basis of Quality Control Charts. By S. K. Ekambaram. India,
London and New York: Asia Publishing House. 1960. Pp. x+96. 16s. 6d. 
The author of this short book sets himself a stiff task. His opening sentence in the preface reads 


‘This book is written for the busy executive in the higher levels of Management who desires to have 
the general picture of the technicalities of quality control in industry, but has no time to go into the 







details of quality control work.’ Some idea of how the author tries to achieve this can be gained from 
the selection of topics discussed, which fall into three groups. The first third of the book deals with
basic statistical theory such as frequency distributions, variation, moments, measures of skewness and 
so forth. The next third deals with control charts based on the mean and range of small samples, 
whilst the final third deals with charts based on the observed fraction defective in samples. 

For the executive, the book’s great weakness is the unnecessary amount of detailed algebra that is 
included. The derivation of the moments about the mean from crude moments is dealt with extensively, 
and the first four theoretical moments of the normal, binomial and Poisson distributions are all 
derived and studied in considerable detail. This approach tends to dull the appetite for quality control. 
Although a lot of this algebra could be skipped in reading the book, the continuity would be broken. 
If, however, the executive does persevere there are some good case studies, five in number, which bring 
out clearly the principles involved and it is a pity that the book was not primarily written around these 
studies, rather than around the three basic distributions mentioned above. A brief discussion of how 
to select a quality control scheme, how to organize a quality control section, and how to assess its 
efficiency, would also have been helpful to the executive. 

The practising statistician will only find this book of very general interest. There is no mention of 
double sampling or sequential schemes, modified limits, demerit schemes or the more modern forms 
of continuous process schemes based on cumulative charts. All charts discussed use the American 
system of 3-sigma limits, and warning limits are not used. No tables giving the constants required 
for control limits are included in the book. Finally, the operating characteristic is only briefly touched 
upon in the last chapter, and there is no detailed discussion of how to set about choosing an appropriate 


scheme, or how to assess the economics of alternative schemes. P. G. MOORE 


An Introduction to Mathematical Statistics. By H. D. Brunk. Massachusetts: Ginn 
and Company. 1960. Pp. xi+403. $7.00. 


The author states that his book was written for use in a one-semester, 3 hr. course which is meant as 
an introduction to mathematical statistics. Prerequisites include a knowledge of the calculus—and
the author might have added algebra—but nothing in probability and statistics, which means that the 
student can start de novo. This states the level adequately for American readers. As far as the English 
student is concerned he will need Pure Mathematics at G.C.E. A-Level at a fairly good standard. The 
author has added starred sections and chapters to his original planned course. These will require more 
mathematical knowledge than is necessary for the unstarred sections. The book is in two parts, 
probability and statistics. 

In probability we have elementary probability spaces, general probability spaces, random variables, 
combined random variables and expectation techniques. Under statistics we find random sampling, 
law of large numbers (including Tchebychef), estimation, central limit theorem, confidence intervals
and testing hypothesis, decision theory, regression, distributions of various statistical criteria, experi- 
mental design, quality control and distribution-free methods. The whole is topped up with a number of 
useful statistical tables. The level of exposition is adequate and there are some numerical illustrations. 


F. N. DAVID 


An Introduction to Mathematical Statistics. By R. V. Hogg and A. T. Craig. New
York and London: The Macmillan Company, New York. 1960. Pp. ix+245. 47s.


The authors state that this book is intended for a two-semester course for undergraduate students 
with some mathematical preparation. They indicate which sections will be found useful if it is desired 
to cut the course down to one semester. From the English point of view the students will require 
little more than advanced level mathematics and the book will certainly be within the compass of the 
first year general student. 

Probability and statistics are intermingled throughout in a way which comes naturally. The topics 
treated are random variables and expectations, the binomial, Poisson, normal and gamma distributions, 
density functions, transformations of variables, estimation, sufficiency, and stochastic independence,
limiting distributions, distribution-free problems, tests of statistical hypotheses, and multivariate
distributions. Tables of the normal, χ², t and F distributions are given.

The mathematical exposition is clear and the seeker for light on the mathematics of statistics will 


find this book meets his needs admirably. F. N. DAVID 













Genetical Research. Vol. 1, no. 1. London and New York: Cambridge University Press. 
1960. Pp. 172. Subscription (three parts annually) £5 or $17.50. 


The rapid growth of genetics in recent years has necessitated changes in existing journals. Genetics 
has lately considerably increased in size, and Heredity this year appears in two volumes rather than 
one preparatory to an increase in girth. The need of additional vehicles for publication has now led to 
the emergence of this new journal. The editorial board includes C. H. Waddington (chairman), 
Charlotte Auerbach, H. Grüneberg, W. Hayes, D. Lewis, G. Pontecorvo and A. Robertson, with
E. C. R. Reeve as Executive Editor. The journal is open to authors of all nationalities, but papers must
be written in English. The twelve papers which comprise the first number range over almost the whole 
field of genetics and deal with organisms as diverse as fungi (Neurospora, Coprinus), protozoa (Para- 
mecium), Drosophila, the mouse and the rabbit. As was to be expected of the Cambridge University 
Press, the production of the new journal is of high quality. H. GRÜNEBERG


La Biométrie. Que sais-je? (Le point des connaissances actuelles, no. 871.) By Eugène
Schreider. Paris: Presses Universitaires de France. 1960. Pp. 126.


This small paper-back is a recent addition to the ‘Que sais-je?’ series. The author’s objects are to 
refute the contention that biometry is a dull, uninteresting science (‘une science aride’) and to show 
the non-specialist reader something of the nature and scope of biometrical research. 

The contents are divided into four parts, the first of which deals with some elementary statistical
procedures, while the remaining three are concerned with their application to biological data. Most of 
the examples are taken from the human species. There are seven figures and various tables, but the 
book lacks an index. Somewhat surprisingly the author has made some use (Tables III and IV) of the 
now rarely used coefficient of variation. In other respects the book is up to date and recent researches 
are cited, including some by the author. 

The section on variability in populations due to heredity and environment, and that dealing with 
changes in man and illustrated by the increase in height in various countries over a period of years, 
are of general interest. S. B. HOLT


Introduction to Health Statistics. By Satya Swaroop. Edinburgh and London: 
E. and S. Livingstone, Ltd. 1960. Pp. 343. 40s.


Dr Swaroop, who is a statistician working with the World Health Organization, has used his very 
full knowledge of the organization, collection and use of health statistics to produce an excellent 
reference book on the subject. It summarizes well the various recommendations and findings of the 
Expert Committees, in one volume. Whether, however, it is suitable for use by students is questionable. 
Its style is not well adapted to easy reading. The text is interspersed with lists and indications for 
procedures, which are useful on reference and when learning off by heart, but not otherwise. It is, also, 
in places, difficult to evaluate the dangers, drawbacks and fallacies of certain procedures, particularly
in the sections on census procedures and morbidity surveys. The book contains a full account of the 
types of data the World Health Organization recommends should be collected, with details of the 
necessary forms, discussion of the use and need for health statistics, methods of collection of the data 
and how it should be analysed and some details of the necessary calculations and diagrams. The 
emphasis in this is particularly on the methods recommended and used by W.H.O. and, therefore, 
of more use to an individual starting a health service de novo, rather than working in an existing 
organization. The appendices contain various international comparisons of birth and death rates, a 
very full reference list of the methods of vital statistics, and a list of some recent W.H.O. expert 
committees and international regulations. W. W. HOLLAND
















Special Functions. By Earl D. Rainville. New York and London: The Macmillan
Company, New York. 1960. Pp. xii+365. 82s. 


The statistics student, who is not a mathematics specialist, is often brought up short in his study 
of mathematical statistics by various manipulations of the special functions which lie in wait in many 
branches of the theory. And elucidation from mathematical text-books is not easy because the 
delineation of the special bit of theory required is usually tied up with a great deal which is irrelevant 
to the particular purpose in hand. This book, pitched at just about the right level of exposition, is 
excellent, and can serve as supplementary reading for any searcher after mathematical statistical 
truths. 

The contents cover (briefly), infinite products, gamma and beta functions, asymptotic series, the 
hypergeometric functions, Bessel functions, generalized and confluent hypergeometric functions, 
generating functions, orthogonal, Legendre, Hermite, Laguerre and Jacobi polynomials, elliptic and 
theta functions. There are exercises and examples given at the end of most chapters. 

Mr Rainville says he has written his book to be of use to physicists, engineers and chemists. The 


book can be unreservedly recommended to statisticians also. F. N. DAVID


The Theory of Matrices. By F. R. Gantmacher. New York: Chelsea Publishing
Company. 1959. Vol. 1. Pp. 374. Vol. 2. Pp. 276. $6.00 each. 


This excellent translation from the Russian, made by Prof. K. A. Hirsch, will be welcomed by the 
large and growing number of mathematicians, scientists, and statisticians who work with matrices. 
No knowledge of matrix theory is assumed, but previous knowledge of determinants and of the theory 
of solution of linear equations is essential. This may limit the usefulness of the book to beginners who 
are interested in learning linear algebra in a strictly logical order (it is, for instance, taken for granted
in the book that the space of n-tuples of complex numbers is n-dimensional); there is a tendency in the
book to deduce elementary properties of matrices from rather more advanced theory than is necessary
(the Cauchy–Binet formula to prove that the rank of AB cannot exceed that of A or B). But for those
mainly interested in applications, these pedagogic objections are of little importance: the book really 
has advanced aims. 

The mathematical treatment is everywhere rigorous, lucid, and amply illustrated by numerical 
examples. Much attention is given to the serious problems of numerical analysis which abound in the 
applications, for instance, that of effectively computing the characteristic function of a matrix. 

The matrix as an expression of a linear operator, both with and without metric notions, is given 
good prominence. Thus, an exhaustive analytic account of the theory of elementary divisors is supplemented
by an illuminating geometric exposition. Many applications discussed relate to differential
equations, in particular, to the stability of solutions. The Routh–Hurwitz method for deciding whether
the eigenvalues of a matrix have negative real parts is discussed fully, together with the contributions
of Hermite and Lyapunov to this important practical problem (the theory of quadratic and of Hankel 
forms is here brought in). A chapter on matrices with non-negative elements (in particular, stochastic 
matrices) includes the remarkable theorems of Perron and Frobenius on the eigenvalues of such 
matrices. There is an excellent account of matrix equations and of functions of matrices: in these the 
reduction to Jordan normal form is made to play an important part. A compendious bibliography lists 


some four hundred titles, including papers as recent as 1959. H. KESTELMAN 


Theory and Solution of Ordinary Differential Equations. By DonaLp GREENSPAN. 
New York and London: The Macmillan Company, New York. 1960. Pp. viii+148. 
38s. 6d. 


With the growth of the ‘dynamics’ of statistical method commonly but not necessarily correctly 
lumped together as ‘stochastic processes’ the interest of the student of mathematical statistics in the 
theory of differential equations has perforce been sharpened. This book is written for those who know 
a reasonable amount of advanced calculus but not all of it will be useful to those who seek to acquire 
mathematical techniques rather than mathematical learning.


The topics covered are first-order equations, linear differential equations of the second and nth 
order, existence theory, linear systems, special functions, approximate solutions, eigenfunctions and 
Fourier series. The delineation is clear and the book is eminently readable. Exercises are given at the 
end of each chapter and the answers supplied. It should serve as a useful introduction to the subject 
for mathematicians. The ‘user’ student may find it too difficult. F. N. DAVID 


Prontuari per Calcoli Statistici: Tavole Numeriche e Complementi. By Silvio
Vianelli. Palermo, Italy: Ed. Abbaco. 1960. Pp. xv+1543. 16,000 lire.


This monumental piece of compilation consists of reprints of all tables which have been calculated 
for or have been found useful in the application of statistical theory, plus a resumé of the theory for 
which the tables were calculated where it is appropriate. The tables occupy nearly 1100 pages and
the exposition a further 400. A few of the tables appear not to have been previously published. The 
printing is entirely clear and one can only marvel at the energy of the compiler and the labours of the 
proof readers. All libraries will need to buy this book if only for record purposes. 

This being said the reviewer would like to offer a few suggestions which would improve the book from 
the point of view of the user. Judging from the relative number of statistical papers published in 
different languages the majority of users will read the English language. It is, therefore, important 
that the index to both tables and exposition should be also in English and not solely in Italian as it is 
at present. It is true that symbols are universal esperanto and that Italian is easy to read but the lack 
of an adequate index may be a stumbling block to many. 

The tables range from the utilitarian—powers of natural numbers, random numbers, etc.—to those 
of modern techniques only just now passing into general use, such as the Kolmogoroff–Smirnoff
criterion. The reviewer was puzzled to find a logical grouping. It is, as things are at present, very 
difficult to find any particular table without having to search through the whole index and this detracts 
considerably from the general utility of the book. The expository material is divided according to 
subject matter with a separate index given for each section which is reasonable to use provided one 
classifies statistical theory in the same way as the author. A similar set of indices for the tables might 
be possible, with repetition of entry where this was found necessary. 

The physical size of this book will make it difficult to handle and one wonders if consideration was 
given to producing not one volume but say six or seven. Questions of cost will undoubtedly have 
entered into this, however, and one can have every sympathy with the author. But all these are minor 
reflexions. The compiler-author is to be congratulated on his enterprise and for those wealthy enough 
in the statistical world to be able to afford £9. 8s. 3d. this is a bargain of bargains. F. N. DAVID 



















Other books received 


Introduction to Probability and Statistics. By Henry L. Alder and Edward B. Roessler.
San Francisco and London: W. H. Freeman and Co. 1960. Pp. xi+252. 20s. 


Engineering Statistics. By Albert H. Bowker and Gerald J. Lieberman. New Jersey, U.S.A.:
Prentice-Hall, Inc. 1959. Pp. 585. 88s. 


Information and Design Processes. Ed. Robert E. Machol. New York, Toronto and London:
McGraw-Hill Book Company, Inc. 1960. Pp. 185. 46s. 


Regression Analysis. By R. L. Plackett. Oxford: Clarendon Press and Oxford University Press.
1960. Pp. 173. 35s. 


Introduction to Linear Programming. By Walter W. Garvin. New York, Toronto and London:
McGraw-Hill Book Company, Inc. 1960. Pp. 281. 68s. 


The Use of Economic Statistics. By C. A. Blyth. The Minerva Series, No. 5. 1960. Pp. 249. 28s.
(cloth), or 22s. (board). 


Stochastic Population Models in Ecology and Epidemiology. By M. S. Bartlett. London:
Methuen and Co. Ltd. New York: John Wiley and Sons, Inc. 1960. Pp. 90. 12s. 6d. 


Contributions to Probability and Statistics. Essays in honor of Harold Hotelling. Stanford studies 
in mathematics and statistics. II. Ed. I. Olkin. California: Stanford University Press. London:
Oxford University Press. 1960. Pp. 517. 52s. 


Statistical Processes and Reliability Engineering. By D. N. Chorafas. Princeton, New York,
Toronto, London: D. van Nostrand Co. Inc. 1960. Pp. 438. 96s. 


An Introduction to Statistical Communication Theory. By David Middleton. New York,
Toronto, London: McGraw-Hill Book Company, Inc. 1960. Pp. 1140. £9. 14s. 


Fundamentals of College Algebra. By William H. Durfee. New York and London: The
Macmillan Company, New York. 1960. Pp. 250. 31s. 6d. 


Applied Boolean Algebra. An Elementary Introduction. By Franz E. Hohn. New York and
London: The Macmillan Company, New York. 1960. Pp. 189. 17s. 6d. 


An Introduction to the Mathematics of Medicine and Biology. By J. D. Defares and
I. N. Sneddon. Amsterdam: North-Holland Publishing Company. 1960. Pp. 663. 85s.


Sampling Methods for Censuses and Surveys. (Third edition.) By F. Yates. London: Charles 
Griffin and Company. 1960. Pp. 440. 54s. 


Finite-difference Methods for Partial Differential Equations. By George E. Forsythe and
Wolfgang R. Wasow. London and New York: John Wiley & Sons, Inc. 1960. Pp. 444. 92s.
or $11.50. 


Symposia of the Society for Experimental Biology. No. XIV. Models and Analogues in Biology. 
Ed. J. W. L. Beament. Cambridge University Press. 1960. Pp. 255. 50s. 


Stichproben in der Amtlichen Statistik. Ed. Gerhard Fürst. Stuttgart and Mainz: W.
Kohlhammer. 1960. Pp. 626. DM 28.


Sequential Medical Trials. By P. Armitage. Oxford: Blackwell Scientific Publications. 1960.
Pp. 105. 20s. 




































BIOMETRIKA 





BACK ISSUES 


Volumes 43-47 (1956-1960) 


These may be obtained from the BIOMETRIKA OFFICE, at the following prices, which include 
packing and postage: 


£3. 10s. (or $10.00) per wrappered volume. 
Bound volumes: £1. 8s. (or $4.00) extra per volume. 
Binding cases: 14s. (or $2.00) each. 


Cheques should be made payable to Biometrika, crossed ‘a/c BIOMETRIKA TRUST’ and sent to 
THE SECRETARY, BIOMETRIKA OFFICE, UNIVERSITY COLLEGE LONDON, 
GOWER STREET, LONDON, W.C. 1, to whom all orders for subscriptions and author’s separates
should also be sent (see inside front cover). The Secretary also has available for sale copies of many
of the papers and tables published in earlier numbers of Biometrika (see pages (iv), (v) and (vi)). 








Volumes 20A, 20B, 21-42 (1928-1955) 


Messrs Wm. Dawson & Sons, Ltd., have been appointed agents for the sale of these 
volumes, which may be obtained from the address below at the following prices, packing and 
postage extra: 





£5. 5s. (or $15.00) per wrappered volume. 

Bound volumes: £1. 8s. (or $4.00) extra per volume. 

Binding cases: 14s. (or $2.00) each.
Index (Subjects Vols. 1-37; Authors Vols. 1-43): 6s. (or $1.00). 


Volumes 1-19 (1902-1927) 


Messrs Wm. Dawson & Sons, Ltd., have been granted permission to reprint Vols. 1-19. 
These are ready for distribution, price £130 bound. Single vols. £7. 10s. each. Would 
librarians and others wishing to have copies please place their orders with:








WM. DAWSON & SONS, LTD., 
16 WEST STREET, FARNHAM, SURREY, ENGLAND 

















(All rights reserved) 


BIOMETRIKA, Vol. 48, Parts 1 and 2 


CONTENTS 


4 PAG 

M. G. Kenpatu. Studies in the ne ge of peeey and statistics. XI. Daniel Bernoulli on maximum 7 
likelihood . ‘ Z A 

F. N. Davin and C. L. Marzows. The variance of Spearman’ 8 ory in ncenial wamplen 

E. C. Frecier and E. 8. Pearson. Tests for rank correlation coefficients. II . 

J. Dursry. Some methods of constructing exact tests . 

C. R. Heatucore. Preemptive priority queueing . ‘ : ¥ 

J. Hasna. A two-sample sequential t-test 

8S. Nasreya. Absolute and incomplete moments of the mailtiyGGiake normal dhicibation . 

Joun 8S. WuirEe. Asymptotic expansions for the mean and variance of the serial correlation eniiliciens 

T. H. Srarxs and H. A. Davin. Significance tests for paired-comparison experiments . , ° 

G. S. Watson. Goodness-of-fit tests on a circle 

H. T. Gontn. The use of orthogonal polynomials of the positive end negative binomial frequency fanetions i in 
curve fitting by Aitken’s method . ‘ ‘ 

A. M. W. VeruaceEn. The estimation of seqression anid error- esse ciate: when the joint distribution 
of the errors is of any continuous form and known apart from a scale parameter . : ° . ° 

C. L. Matitows. Latent vectors of random symmetric matrices 

H. Leon Harter. Expected values of normal order statistics 

Frank A. Haight. A distribution analogous to the Borel-Tanner

W. L. Nicholson. Occupancy probability distribution critical points

Masashi Okamoto and Goro Ishii. Test of independence in intraclass 2 × 2 tables


MISCELLANEA

Colin R. Blyth and David W. Hutchinson. Tables of Neyman-shortest unbiased confidence intervals for
the Poisson parameter

K. C. Sreedharan Pillai and Angeles R. Buenaventura. Upper percentage points of a substitute
F-ratio using ranges

A. M. Walker. On Durbin's formula for the limiting generalized variance of a sample of consecutive
observations from a moving-average process

D. E. Barton and F. N. David. The central sampling moments of the mean in samples from a finite
population (Aty's formulae and Madow's central limit)

Man Mohan Sondhi. A note on the quadrivariate normal integral

J. Gani. On the stochastic matrix in a genetic model of Moran

W. J. Ewens. Departures from assumption in sequential analysis 

J. C. Gower. A note on some asymptotic properties of the logarithmic series distribution

M. Atiqullah. On a property of balanced designs

M. J. R. Healy and J. C. Gower. Aliasing in partially confounded factorial experiments

M. G. Kendall. Studies in the History of Probability and Statistics. XII. The Book of Fate

J. C. Tanner. A derivation of the Borel distribution

M. G. Kendall. A theorem in trend analysis

D. E. Barton. Unbiased estimation of a set of probabilities


CORRIGENDA


REVIEWS 
P. A. P. Moran. The Theory of Storage. 
A. Y. Khintchine. Mathematical Methods in the Theory of Queueing
E. J. Williams. Regression Analysis
Herman Chernoff and Lincoln E. Moses. Elementary Decision Theory
R. Duncan Luce. Individual Choice Behaviour
R. G. D. Steel and J. H. Torrie. Principles and Procedures of Statistics
S. K. Ekambaram. The Statistical Basis of Quality Control Charts
H. D. Brunk. An Introduction to Mathematical Statistics
R. V. Hogg and A. T. Craig. An Introduction to Mathematical Statistics
Genetical Research. Vol. 1, no. 1
Eugène Schreider. La Biométrie. Que sais-je?
Satya Swaroop. Introduction to Health Statistics 
Earl D. Rainville. Special Functions
F. R. Gantmacher. The Theory of Matrices
Donald Greenspan. Theory and Solution of Ordinary Differential Equations
Silvio Vianelli. Prontuari per Calcoli Statistici: Tavole Numeriche e Complementi


OTHER BOOKS RECEIVED


Printed in Great Britain at the University Press, Cambridge (Brooke Crutchley, University Printer) 








